Xylose isomerases and their uses

ABSTRACT

This disclosure relates to novel xylose isomerases and their uses, particularly in fermentation processes that employ xylose-containing media.

1. BACKGROUND

The efficient, commercial production of biofuels from plant material, such as sugarcane, requires the fermentation of pentoses, such as xylose. Xylose in plant material typically comes from lignocellulose, which is a matrix composed of cellulose, hemicelluloses, and lignin. Lignocellulose is broken down either by acid hydrolysis or enzymatic reaction, yielding xylose in addition to other monosaccharides, such as glucose (Maki et al., 2009, Int. J. Biol. Sci. 5:500-516).

Fungi, especially Saccharomyces cerevisiae, are commercially relevant microorganisms that ferment sugars into biofuels such as ethanol. However, S. cerevisiae does not endogenously metabolize xylose, requiring genetic modifications that allow it to convert xylose into xylulose. Other organisms, whose usefulness in ethanol production is limited, are able to metabolize xylose (Nevoigt, 2008, Microbiol. Mol. Biol. Rev. 72:379-412).

Two pathways have been identified for the metabolism of xylose to xylulose in microorganisms: the xylose reductase (XR, EC 1.1.1.307)/xylitol dehydrogenase (XDH, EC 1.1.1.9, 1.1.1.10 and 1.1.1.B19) pathway and the xylose isomerase (XI, EC 5.3.1.5) pathway. Use of the XR/XDH pathway for xylose metabolism creates an imbalance of cofactors (excess NADH and NADP+), limiting the potential output of this pathway for the production of ethanol. The XI pathway, on the other hand, converts xylose to xylulose in a single step and does not create a cofactor imbalance (Young et al., 2010, Biotechnol. Biofuels 3:24-36).

Because S. cerevisiae does not possess a native XI, it has been desirable to search for an XI in another organism to insert into S. cerevisiae for the purpose of biofuels production. Several XI genes have been discovered, although little or no enzymatic activity upon expression in S. cerevisiae has been a common problem. The XI from Piromyces sp. E2 was the first heterologously expressed XI in S. cerevisiae whose enzymatic activity could be observed (WO 03/062430).

2. SUMMARY

Due to the physiology of S. cerevisiae and the process of commercial biofuel production, there are other characteristics besides activity that are valuable in a commercially useful XI. During fermentation, the pH of the yeast cell and its environment can become more acidic (Rosa and Sa-Correia, 1991, Appl. Environ. Microbiol. 57:830-835). The ability of the XI to function in an acidic environment is therefore highly desirable. Accordingly, there is still a need in the art for XI enzymes with enhanced activity to convert xylose to xylulose for biofuels production under a broader range of commercially relevant conditions.

The present disclosure relates to novel xylose isomerases. The xylose isomerases have desirable characteristics for xylose fermentation, such as high activity, tolerance to acidic conditions (i.e., pH levels below 7, e.g., pH 6.5 or pH 6), or both.

The present disclosure has multiple aspects. In one aspect, the disclosure is directed to XI polypeptides. The polypeptides of the disclosure typically comprise amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to any of the XI polypeptides of Table 1, or the catalytic domain or dimerization domain thereof, or are encoded by nucleic acid sequences comprising nucleotide sequences having at least 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to any of the nucleic acids of Table 1:

TABLE 1

SEQ ID NO: | Clone No. | Organism Classification | Type of Sequence | Catalytic Domain | Dimerization Domain
1 | 1754MI2_001 | Bacteroidales | DNA | |
2 | 1754MI2_001 | Bacteroidales | Amino Acid | 2-376 | 377-437
3 | 5586MI6_004 | Bacteroidales | DNA | |
4 | 5586MI6_004 | Bacteroidales | Amino Acid | 2-376 | 377-437
5 | 5749MI1_003 | Bacteroidales | DNA | |
6 | 5749MI1_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
7 | 5750MI1_003 | Bacteroidales | DNA | |
8 | 5750MI1_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
9 | 5750MI2_003 | Bacteroidales | DNA | |
10 | 5750MI2_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
11 | 5586MI5_004 | Bacteroides | DNA | |
12 | 5586MI5_004 | Bacteroides | Amino Acid | 2-375 | 376-435
13 | 5586MI202_004 | Bacteroides | DNA | |
14 | 5586MI202_004 | Bacteroides | Amino Acid | 2-377 | 378-438
15 | 5586MI211_003 | Bacteroides | DNA | |
16 | 5586MI211_003 | Bacteroides | Amino Acid | 2-376 | 377-437
17 | 5606MI1_005 | Bacteroides | DNA | |
18 | 5606MI1_005 | Bacteroides | Amino Acid | 2-377 | 378-438
19 | 5606MI2_003 | Bacteroides | DNA | |
20 | 5606MI2_003 | Bacteroides | Amino Acid | 2-378 | 379-439
21 | 5610MI3_003 | Bacteroides | DNA | |
22 | 5610MI3_003 | Bacteroides | Amino Acid | 2-377 | 378-439
23 | 5749MI2_004 | Bacteroides | DNA | |
24 | 5749MI2_004 | Bacteroides | Amino Acid | 2-377 | 378-438
25 | 5750MI3_003 | Bacteroides | DNA | |
26 | 5750MI3_003 | Bacteroides | Amino Acid | 2-377 | 378-438
27 | 5750MI4_003 | Bacteroides | DNA | |
28 | 5750MI4_003 | Bacteroides | Amino Acid | 2-377 | 378-438
29 | 5751MI4_002 | Bacteroides | DNA | |
30 | 5751MI4_002 | Bacteroides | Amino Acid | 2-376 | 377-437
31 | 5751MI5_003 | Bacteroides | DNA | |
32 | 5751MI5_003 | Bacteroides | Amino Acid | 2-377 | 378-438
33 | 5751MI6_004 | Bacteroides | DNA | |
34 | 5751MI6_004 | Bacteroides | Amino Acid | 2-377 | 378-438
35 | 5586MI22_003 | Clostridiales | DNA | |
36 | 5586MI22_003 | Clostridiales | Amino Acid | 2-375 | 376-439
37 | 1753MI4_001 | Firmicutes | DNA | |
38 | 1753MI4_001 | Firmicutes | Amino Acid | 2-374 | 375-440
39 | 1753MI6_001 | Firmicutes | DNA | |
40 | 1753MI6_001 | Firmicutes | Amino Acid | 2-374 | 375-440
41 | 1753MI35_004 | Firmicutes | DNA | |
42 | 1753MI35_004 | Firmicutes | Amino Acid | 2-375 | 376-441
43 | 1754MI9_004 | Firmicutes | DNA | |
44 | 1754MI9_004 | Firmicutes | Amino Acid | 2-375 | 376-440
45 | 1754MI22_004 | Firmicutes | DNA | |
46 | 1754MI22_004 | Firmicutes | Amino Acid | 2-375 | 376-440
47 | 727MI1_002 | Firmicutes | DNA | |
48 | 727MI1_002 | Firmicutes | Amino Acid | 2-372 | 373-436
49 | 727MI9_005 | Firmicutes | DNA | |
50 | 727MI9_005 | Firmicutes | Amino Acid | 2-374 | 375-438
51 | 727MI27_002 | Firmicutes | DNA | |
52 | 727MI27_002 | Firmicutes | Amino Acid | 2-374 | 375-439
53 | 1753MI2_006 | Neocallimastigales | DNA | |
54 | 1753MI2_006 | Neocallimastigales | Amino Acid | 2-376 | 377-437
55 | 5586MI3_005 | Neocallimastigales | DNA | |
56 | 5586MI3_005 | Neocallimastigales | Amino Acid | 2-376 | 377-437
57 | 5586MI91_002 | Neocallimastigales | DNA | |
58 | 5586MI91_002 | Neocallimastigales | Amino Acid | 2-376 | 377-437
59 | 5586MI194_003 | Neocallimastigales | DNA | |
60 | 5586MI194_003 | Neocallimastigales | Amino Acid | 2-376 | 377-438
61 | 5586MI198_003 | Neocallimastigales | DNA | |
62 | 5586MI198_003 | Neocallimastigales | Amino Acid | 2-375 | 376-437
63 | 5586MI201_003 | Neocallimastigales | DNA | |
64 | 5586MI201_003 | Neocallimastigales | Amino Acid | 2-376 | 377-438
65 | 5586MI204_002 | Neocallimastigales | DNA | |
66 | 5586MI204_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
67 | 5586MI207_002 | Neocallimastigales | DNA | |
68 | 5586MI207_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
69 | 5586MI209_003 | Neocallimastigales | DNA | |
70 | 5586MI209_003 | Neocallimastigales | Amino Acid | 2-375 | 376-437
71 | 5586MI214_002 | Neocallimastigales | DNA | |
72 | 5586MI214_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
73 | 5751MI3_001 | Neocallimastigales | DNA | |
74 | 5751MI3_001 | Neocallimastigales | Amino Acid | 2-375 | 376-437
75 | 5753MI3_002 | Prevotella | DNA | |
76 | 5753MI3_002 | Prevotella | Amino Acid | 2-376 | 377-439
77 | 1754MI1_001 | Prevotella | DNA | |
78 | 1754MI1_001 | Prevotella | Amino Acid | 2-377 | 378-439
79 | 1754MI3_007 | Prevotella | DNA | |
80 | 1754MI3_007 | Prevotella | Amino Acid | 2-377 | 378-439
81 | 1754MI5_009 | Prevotella | DNA | |
82 | 1754MI5_009 | Prevotella | Amino Acid | 2-375 | 376-437
83 | 5586MI1_003 | Prevotella | DNA | |
84 | 5586MI1_003 | Prevotella | Amino Acid | 2-377 | 378-439
85 | 5586MI2_006 | Prevotella | DNA | |
86 | 5586MI2_006 | Prevotella | Amino Acid | 2-377 | 378-439
87 | 5586MI8_003 | Prevotella | DNA | |
88 | 5586MI8_003 | Prevotella | Amino Acid | 2-377 | 378-439
89 | 5586MI14_003 | Prevotella | DNA | |
90 | 5586MI14_003 | Prevotella | Amino Acid | 2-377 | 378-439
91 | 5586MI26_003 | Prevotella | DNA | |
92 | 5586MI26_003 | Prevotella | Amino Acid | 2-377 | 378-439
93 | 5586MI86_001 | Prevotella | DNA | |
94 | 5586MI86_001 | Prevotella | Amino Acid | 2-376 | 377-438
95 | 5586MI108_002 | Prevotella | DNA | |
96 | 5586MI108_002 | Prevotella | Amino Acid | 2-377 | 378-439
97 | 5586MI182_004 | Prevotella | DNA | |
98 | 5586MI182_004 | Prevotella | Amino Acid | 2-377 | 378-439
99 | 5586MI193_004 | Prevotella | DNA | |
100 | 5586MI193_004 | Prevotella | Amino Acid | 2-376 | 377-438
101 | 5586MI195_003 | Prevotella | DNA | |
102 | 5586MI195_003 | Prevotella | Amino Acid | 2-376 | 377-438
103 | 5586MI216_003 | Prevotella | DNA | |
104 | 5586MI216_003 | Prevotella | Amino Acid | 2-376 | 377-438
105 | 5586MI197_003 | Prevotella | DNA | |
106 | 5586MI197_003 | Prevotella | Amino Acid | 2-376 | 377-438
107 | 5586MI199_003 | Prevotella | DNA | |
108 | 5586MI199_003 | Prevotella | Amino Acid | 2-376 | 377-438
109 | 5586MI200_003 | Prevotella | DNA | |
110 | 5586MI200_003 | Prevotella | Amino Acid | 2-376 | 377-438
111 | 5586MI203_003 | Prevotella | DNA | |
112 | 5586MI203_003 | Prevotella | Amino Acid | 2-376 | 377-438
113 | 5586MI205_004 | Prevotella | DNA | |
114 | 5586MI205_004 | Prevotella | Amino Acid | 2-376 | 377-438
115 | 5586MI206_004 | Prevotella | DNA | |
116 | 5586MI206_004 | Prevotella | Amino Acid | 2-376 | 377-438
117 | 5586MI208_003 | Prevotella | DNA | |
118 | 5586MI208_003 | Prevotella | Amino Acid | 2-376 | 377-438
119 | 5586MI210_002 | Prevotella | DNA | |
120 | 5586MI210_002 | Prevotella | Amino Acid | 2-374 | 375-437
121 | 5586MI212_002 | Prevotella | DNA | |
122 | 5586MI212_002 | Prevotella | Amino Acid | 2-376 | 377-438
123 | 5586MI213_003 | Prevotella | DNA | |
124 | 5586MI213_003 | Prevotella | Amino Acid | 2-376 | 377-438
125 | 5586MI215_003 | Prevotella | DNA | |
126 | 5586MI215_003 | Prevotella | Amino Acid | 2-376 | 377-438
127 | 5607MI1_003 | Prevotella | DNA | |
128 | 5607MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
129 | 5607MI2_003 | Prevotella | DNA | |
130 | 5607MI2_003 | Prevotella | Amino Acid | 2-376 | 377-442
131 | 5607MI3_003 | Prevotella | DNA | |
132 | 5607MI3_003 | Prevotella | Amino Acid | 2-376 | 377-438
133 | 5607MI4_005 | Prevotella | DNA | |
134 | 5607MI4_005 | Prevotella | Amino Acid | 2-376 | 377-438
135 | 5607MI5_002 | Prevotella | DNA | |
136 | 5607MI5_002 | Prevotella | Amino Acid | 2-376 | 377-439
137 | 5607MI6_002 | Prevotella | DNA | |
138 | 5607MI6_002 | Prevotella | Amino Acid | 2-376 | 377-438
139 | 5607MI7_002 | Prevotella | DNA | |
140 | 5607MI7_002 | Prevotella | Amino Acid | 2-376 | 377-438
141 | 5608MI1_004 | Prevotella | DNA | |
142 | 5608MI1_004 | Prevotella | Amino Acid | 2-376 | 377-438
143 | 5608MI2_002 | Prevotella | DNA | |
144 | 5608MI2_002 | Prevotella | Amino Acid | 2-375 | 376-437
145 | 5608MI3_004 | Prevotella | DNA | |
146 | 5608MI3_004 | Prevotella | Amino Acid | 2-376 | 377-438
147 | 5609MI1_005 | Prevotella | DNA | |
148 | 5609MI1_005 | Prevotella | Amino Acid | 2-376 | 377-438
149 | 5610MI1_003 | Prevotella | DNA | |
150 | 5610MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
151 | 5610MI2_004 | Prevotella | DNA | |
152 | 5610MI2_004 | Prevotella | Amino Acid | 2-376 | 377-438
153 | 5751MI1_003 | Prevotella | DNA | |
154 | 5751MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
155 | 5751MI2_003 | Prevotella | DNA | |
156 | 5751MI2_003 | Prevotella | Amino Acid | 2-376 | 377-438
157 | 5752MI1_003 | Prevotella | DNA | |
158 | 5752MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
159 | 5752MI2_003 | Prevotella | DNA | |
160 | 5752MI2_003 | Prevotella | Amino Acid | 2-376 | 377-438
161 | 5752MI3_002 | Prevotella | DNA | |
162 | 5752MI3_002 | Prevotella | Amino Acid | 2-376 | 377-438
163 | 5752MI5_003 | Prevotella | DNA | |
164 | 5752MI5_003 | Prevotella | Amino Acid | 2-376 | 377-438
165 | 5752MI6_004 | Prevotella | DNA | |
166 | 5752MI6_004 | Prevotella | Amino Acid | 2-376 | 377-438
167 | 5753MI1_002 | Prevotella | DNA | |
168 | 5753MI1_002 | Prevotella | Amino Acid | 2-376 | 377-438
169 | 5753MI2_002 | Prevotella | DNA | |
170 | 5753MI2_002 | Prevotella | Amino Acid | 2-376 | 377-438
171 | 5753MI4_002 | Prevotella | DNA | |
172 | 5753MI4_002 | Prevotella | Amino Acid | 2-376 | 377-438
173 | 5752MI4_004 | Prevotella | DNA | |
174 | 5752MI4_004 | Prevotella | Amino Acid | 2-376 | 377-438
175 | 727MI4_006 | Rhizobiales | DNA | |
176 | 727MI4_006 | Rhizobiales | Amino Acid | 2-373 | 374-435
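The domain boundaries in Table 1 are given as residue ranges. As a minimal sketch of how such a range could be applied to a sequence held as a string, the following assumes the ranges are 1-based and inclusive (an assumption, since the numbering convention is not stated here). The function name `extract_domain` and the toy sequence are hypothetical and are not taken from the disclosure.

```python
def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of sequence, using 1-based inclusive positions.

    Assumption: Table 1 ranges such as 2-376 count the first residue as 1.
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain range falls outside the sequence")
    # Python slices are 0-based and end-exclusive, so only the start is shifted.
    return sequence[start - 1:end]


# Hypothetical 10-residue toy sequence, not a sequence from the disclosure.
toy = "MKTAYIAKQR"
first_domain = extract_domain(toy, 2, 5)    # residues 2 through 5
second_domain = extract_domain(toy, 6, 10)  # residues 6 through 10
```

Under this convention, a range such as 2-376 yields 375 residues and the adjacent range 377-437 yields 61 residues, with no overlap between the two.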

In specific embodiments, a polypeptide of the disclosure comprises an amino acid sequence having:

-   (1) (a) at least 97% or 98% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (2) (a) at least 95%, 97% or 98% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (3) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:38 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:38), and optionally further comprises one, two, three, four or all five of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:208; (iii) SEQ ID NO:209; (iv) SEQ ID NO:210; and (v) SEQ ID NO:211;
-   (4) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:2 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:2);
-   (5) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:58 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:58);
-   (6) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:42 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:42), and optionally further comprises one, two or all three of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:210; and (iii) SEQ ID NO:211;
-   (7) (a) at least 97% or 98% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84), and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (8) (a) at least 97% or 98% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (9) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:54 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:54);
-   (10) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:46 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:46), and optionally further comprises SEQ ID NO:206 or SEQ ID NO:207;
-   (11) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:16 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:16);
-   (12) at least 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:82 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:82); and/or
-   (13) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:32 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:32).

The XIs of the disclosure can be characterized in terms of their activity. In some embodiments, a XI of the disclosure has at least 1.3 times the activity of the Orpinomyces sp. XI assigned Genbank Accession No. 169733248 (“Op-XI”) at pH 7.5, for example using the assay described in any of Examples 4, 6 and 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.25 to 3.0 times, from 1.5 to 3 times, from 1.5 to 2.25 times, or from 1.75 to 3 times the activity of Op-XI at pH 7.5.

The XIs of the disclosure can also be characterized in terms of their tolerance to acidic environments (e.g., at a pH of 6.5 or 6). In some embodiments, a XI of the disclosure has at least 1.9 times the activity of the Op-XI at pH 6, for example using the assay described in Example 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.9 to 4.1 times, from 2.4 to 4.1 times, from 2.4 to 3.9 times, or 2.4 to 4.1 times the activity of Op-XI at pH 6.

Tolerance to acidic environments can also be characterized as a ratio of activity at pH 6 to activity at pH 7.5 (“a pH 6 to pH 7.5 activity ratio”), for example as measured using the assay of Example 7. In some embodiments, the pH 6 to pH 7.5 activity ratio is at least 0.5 or at least 0.6. In various embodiments, the pH 6 to pH 7.5 activity ratio is 0.5-0.9 or 0.6-0.9.
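By way of illustration only, the pH 6 to pH 7.5 activity ratio described above can be computed as a simple quotient. The sketch below assumes both activities were measured in the same units under the same assay; the function names and the numerical activity values are hypothetical and are not data from the Examples.

```python
def ph_activity_ratio(activity_ph6: float, activity_ph7_5: float) -> float:
    """Return activity at pH 6 divided by activity at pH 7.5 (same units assumed)."""
    if activity_ph7_5 <= 0:
        raise ValueError("activity at pH 7.5 must be positive")
    return activity_ph6 / activity_ph7_5


def in_ratio_range(ratio: float, low: float = 0.5, high: float = 0.9) -> bool:
    """Check a ratio against an inclusive range such as 0.5-0.9 or 0.6-0.9."""
    return low <= ratio <= high


# Hypothetical activity values chosen only to show the arithmetic.
ratio = ph_activity_ratio(3.0, 4.0)
print(ratio, in_ratio_range(ratio), in_ratio_range(ratio, low=0.6))
```

The same quotient form applies to the fold-activity comparisons against Op-XI, with the Op-XI activity at the same pH as the denominator.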

In another aspect, the disclosure is directed to a nucleic acid which encodes a XI polypeptide of the disclosure. In various embodiments, the nucleic acid comprises a nucleotide sequence with at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to the nucleotide sequence of any one of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, and 175, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.

The nucleic acids of the disclosure can be codon optimized, e.g., for expression in eukaryotic organisms such as yeast or filamentous fungi. Exemplary codon optimized open reading frames for expression in S. cerevisiae are SEQ ID NO:238 (encoding a XI of SEQ ID NO:54), SEQ ID NO:239 (encoding a XI of SEQ ID NO:58), SEQ ID NO:244 (encoding a XI of SEQ ID NO:78), SEQ ID NO:245 (encoding a XI of SEQ ID NO:96), SEQ ID NO:246 (encoding a XI of SEQ ID NO:38), SEQ ID NO:247 (encoding a XI of SEQ ID NO:78), SEQ ID NO:248 (encoding a XI of SEQ ID NO:96), and SEQ ID NO:249 (encoding a XI of SEQ ID NO:38). In various embodiments, the disclosure provides nucleic acids comprising nucleotide sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of any one of SEQ ID NOs:238, 239, 244, 245, 246, 247, 248 and 249, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.

In other aspects, the disclosure is directed to a vector comprising a XI-encoding nucleotide sequence, for example a vector having an origin of replication and/or a promoter sequence operably linked to the XI-encoding nucleotide sequence. The promoter sequence can be one that is operable in a eukaryotic cell, for example in a fungal cell. In some embodiments, the promoter is operable in yeast (e.g., S. cerevisiae) or filamentous fungi.

In yet another aspect, the disclosure is directed to a recombinant cell comprising a nucleic acid that encodes a XI polypeptide. Particularly, the cell is engineered to express any of the XI polypeptides described herein. The recombinant cell may be of any species, and is preferably a eukaryotic cell, for example a yeast cell. Suitable genera of yeast include Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Issatchenkia and Yarrowia. In specific embodiments, the recombinant cell is a S. cerevisiae, S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, K. lactis, I. orientalis, K. marxianus or K. fragilis. Suitable genera of filamentous fungi include Aspergillus, Penicillium, Rhizopus, Chrysosporium, Myceliophthora, Trichoderma, Humicola, Acremonium and Fusarium. In specific embodiments, the recombinant cell is an Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Penicillium chrysogenum, Myceliophthora thermophila, or Rhizopus oryzae.

The recombinant cell may also be mutagenized or engineered to include modifications other than the recombinant expression of XI, particularly those that make the cell more suited to utilize xylose in a fermentation pathway. Exemplary additional modifications create one, two, three, four, five or even more of the following phenotypes: (a) increase in xylose transport into the cell; (b) increase in aerobic growth rate on xylose; (c) increase in xylulose kinase activity; (d) increase in flux through the pentose phosphate pathway into glycolysis; (e) decrease in aldose reductase activity; (f) decrease in sensitivity to catabolite repression; (g) increase in tolerance to biofuels, e.g., ethanol; (h) increase in tolerance to intermediate production (e.g., xylitol); (i) increase in temperature tolerance; (j) osmolarity of organic acids; and (k) a reduced production of byproducts.

Increases in activity can be achieved by increased expression levels, for example expression of a hexose or pentose (e.g., xylose) transporter, a xylulose kinase, a glycolytic enzyme, or an ethanologenic enzyme is increased. The increased expression levels are achieved by overexpressing an endogenous protein or by expressing a heterologous protein.

Other modifications to the recombinant cell that are part of the disclosure are modifications that decrease the activity of genes or pathways in the recombinant cell. Preferably, the expression levels of one, two, three or more of the genes for hexose kinase, MIG-1, MIG-2, XR, aldose reductase, and XDH are reduced. Reducing gene activity can be achieved by a targeted deletion or disruption of the gene (and optionally reintroducing the gene under the control of a different promoter that drives lower levels of expression or inducible expression).

In yet other aspects, the disclosure is directed to methods of producing fermentation products, for example one or more of ethanol, butanol, diesel, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, itaconic acid, an amino acid, 1,3-propane-diol, ethylene, glycerol, a β-lactam antibiotic and a cephalosporin. Typically, a cell that recombinantly expresses a XI of the disclosure is cultured in a xylose-containing medium, for example a medium supplemented with a lignocellulosic hydrolysate. The media may also contain glucose, arabinose, or other sugars, particularly those derived from lignocellulose. The media may be of any pH, particularly a pH between 3.0 and 9.0, preferably between 4.0 and 8.0, more preferably between 5.0 and 8.0, even more preferably between 6.0 and 7.5. The culture may occur in any media where the culture is under anaerobic or aerobic conditions, preferably under anaerobic conditions for production of compounds mentioned above and aerobically for biomass/cellular production. Optionally, the methods further comprise recovering the fermentation product produced by the recombinant cell.

3. BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1B are maps for the vectors pMEV-ΔxylA (MEV3 xylA del) and PCR-BluntII-TOPO-xylA, respectively, used in the activity-based screen for XIs.

FIG. 2 illustrates the experimental strategy for the two-step marker exchange approach.

FIG. 3 is a map of the vector p426PGK1 for expressing XI in yeast strain, Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108).

FIG. 4 shows the growth rates on xylose containing media of selected clones expressed in yeast strain, Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108).

FIGS. 5A-5D are maps for the vectors pYDAB-006, pYDURA01, pYDPt-005 and pYDAB-0006, respectively, all used in creating strains of industrial S. cerevisiae strain yBPA130 with a single genomic copy of select XI clones.

FIG. 6 is a map of vector YDAB008-rDNA for multiple XI integration into S. cerevisiae strain yBPB007 and yBPB008.

FIGS. 7A-7D show monosaccharide (including xylose) utilization and ethanol production by strains of industrial S. cerevisiae with multiple copies of XI clones integrated into ribosomal DNA loci.

FIG. 8: Production of ethanol from glycolytic and pentose phosphate (“PPP”) pathways. Not all steps are shown. For example, glyceraldehyde-3-phosphate is converted to pyruvate via a series of glycolytic steps: (1) glyceraldehyde-3-phosphate to 3-phospho-D-glyceroyl-phosphate catalyzed by glyceraldehyde-3-phosphate dehydrogenase (TDH1-3); (2) 3-phospho-D-glyceroyl-phosphate to 3-phosphoglycerate catalyzed by 3-phosphoglycerate kinase (PGK1); (3) 3-phosphoglycerate to 2-phosphoglycerate catalyzed by phosphoglycerate mutase (GPM1); (4) 2-phosphoglycerate to phosphoenolpyruvate catalyzed by enolase (ENO1; ENO2); and (5) phosphoenolpyruvate to pyruvate catalyzed by pyruvate kinase (PYK2; CDC19). Other abbreviations: DHAP=dihydroxy-acetone-phosphate; GPD=Glycerol-3-phosphate dehydrogenase; RHR2/HOR2=DL-glycerol-3-phosphatase; XI=xylose isomerase; GRE=xylose reductase/aldose reductase; XYL=xylitol dehydrogenase; XKS=xylulokinase; PDC=pyruvate decarboxylase; ADH=alcohol dehydrogenase; ALD=aldehyde dehydrogenase; HXK=hexokinase; PGI=phosphoglucose isomerase; PFK=phosphofructokinase; FBA=aldolase; TPI=triosephosphate isomerase; ZWF=glucose-6 phosphate dehydrogenase; SOL=6-phosphogluconolactonase; GND=6-phosphogluconate dehydrogenase; RPE=D-ribulose-5-Phosphate 3-epimerase; RKI=ribose-5-phosphate ketol-isomerase; TKL=transketolase; TAL=transaldolase. Heavy dashed arrows indicate reactions and corresponding enzymes that can be reduced or eliminated to increase xylose utilization, particularly in the production of ethanol, and heavy solid arrows indicate reactions and corresponding enzymes that can be increased to increase xylose utilization, particularly in the production of ethanol. The enzymes shown in FIG. 8 are encoded by S. cerevisiae genes. The S. cerevisiae genes are used for exemplification purposes. Analogous enzymes and modifications in other organisms are within the scope of the present disclosure.

4. DETAILED DESCRIPTION

4.1 Xylose Isomerase Polypeptides

A “xylose isomerase” or “XI” is an enzyme that catalyzes the direct isomerisation of D-xylose into D-xylulose and/or vice versa. This class of enzymes is also known as D-xylose ketoisomerases. A xylose isomerase herein may also be capable of catalyzing the conversion between D-glucose and D-fructose (and accordingly may be referred to as a glucose isomerase).

A “XI polypeptide of the disclosure” or a “XI of the disclosure” is a xylose isomerase having an amino acid sequence that is related to any one of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176. In some embodiments, the xylose isomerase of the disclosure has an amino acid sequence with at least about 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity thereto, or to a catalytic or dimerization domain thereof. The xylose isomerase of the disclosure can also have 100% sequence identity to one of the foregoing sequences.

The disclosure provides isolated, synthetic or recombinant XI polypeptides comprising an amino acid sequence having at least about 80%, e.g., at least about 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or complete (100%) sequence identity to a polypeptide of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176, over a region of at least about 10, e.g., at least about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, or 350 residues, or over the full length of the polypeptide, over the length of catalytic domain, or over the length of the dimerization domain.
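As a non-limiting illustration of how a percent sequence identity over an aligned region might be computed, the sketch below takes two sequences that have already been aligned (gaps written as "-") and counts identical columns. It assumes one common convention, namely identical residues divided by the number of alignment columns including gapped columns; other denominators are also used in practice, and the disclosure does not fix one here. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: identical residues / alignment columns * 100,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignments, not sequences from the disclosure.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))  # one mismatch in eight columns
print(percent_identity("MK-AY", "MKTAY"))        # one gapped column in five
```

To evaluate identity over a catalytic or dimerization domain only, the same function can be applied to the alignment columns that cover that domain.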

The XI polypeptides of the disclosure can be encoded by a nucleic acid sequence having at least about 80%, about 85%, about 86%, about 87%, about 88%, about 89%, or about 90% sequence identity to SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or by a nucleic acid sequence capable of hybridizing under high stringency conditions to a complement of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or to a fragment thereof. Exemplary nucleic acids of the disclosure are described in Section 4.2 below.

In specific embodiments, a polypeptide of the disclosure comprises an amino acid sequence having:

-   (1) (a) at least 97% or 98% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (2) (a) at least 95%, 97% or 98% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (3) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:38 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:38), and optionally further comprises one, two, three, four or all five of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:208; (iii) SEQ ID NO:209; (iv) SEQ ID NO:210; and (v) SEQ ID NO:211;
-   (4) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:2 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:2);
-   (5) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:58 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:58);
-   (6) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:42 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:42), and optionally further comprises one, two or all three of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:210; and (iii) SEQ ID NO:211;
-   (7) (a) at least 97% or 98% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84), and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (8) (a) at least 97% or 98% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
-   (9) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:54 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:54);
-   (10) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:46 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:46), and optionally further comprises SEQ ID NO:206 or SEQ ID NO:207;
-   (11) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:16 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:16);
-   (12) at least 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:82 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:82); and/or
-   (13) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:32 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:32).

An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990, J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. These initial neighborhood word hits act as starting points to find longer HSPs containing them. The word hits are expanded in both directions along each of the two sequences being compared for as far as the cumulative alignment score can be increased. Extension of the word hits is stopped when: the cumulative alignment score falls off by the quantity X from a maximum achieved value; the cumulative score goes to zero or below; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1992, Proc. Nat'l. Acad. Sci. USA 89:10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
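The extension step described above can be sketched in a few lines. This is a simplified, ungapped illustration and not the NCBI implementation: the function name `extend_ungapped`, the exact-match seed, and the default `x_drop` of 20 are assumptions made for the example, while the match and mismatch scores reuse the M and N values quoted in the text.

```python
def extend_ungapped(query, subject, q_start, s_start, word_len,
                    match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit in both directions without gaps.

    Returns (score, q_begin, q_end), with q_end exclusive.
    Hypothetical helper for illustration only.
    """
    seed_score = word_len * match  # assumes the seed word matches exactly

    def walk(q_pos, s_pos, step):
        best_gain = gain = 0
        best_len = length = 0
        while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
            gain += match if query[q_pos] == subject[s_pos] else mismatch
            length += 1
            if gain > best_gain:
                best_gain, best_len = gain, length
            # Stop when the score drops by X from its maximum,
            # or when the cumulative score reaches zero or below.
            if best_gain - gain >= x_drop or seed_score + gain <= 0:
                break
            q_pos += step
            s_pos += step
        return best_gain, best_len

    right_gain, right_len = walk(q_start + word_len, s_start + word_len, +1)
    left_gain, left_len = walk(q_start - 1, s_start - 1, -1)
    score = seed_score + left_gain + right_gain
    return score, q_start - left_len, q_start + word_len + right_len
```

Only the best-scoring prefix of each extension is kept, so trailing mismatches seen before the stop condition fires do not lower the reported score.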

Any of the amino acid sequences described herein can be produced together or in conjunction with at least 1, e.g., at least (or up to) 2, 3, 5, 10, or 20 heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence, and/or deletions of at least 1, e.g., at least (or up to) 2, 3, 5, 10, or 20 amino acids from the C- and/or N-terminal ends of a XI of the disclosure.

The XIs of the disclosure can be characterized in terms of their activity. In some embodiments, a XI of the disclosure has at least 1.3 times the activity of the Orpinomyces sp. XI assigned Genbank Accession No. 169733248 (“Op-XI”) at pH 7.5, for example using the assay described in any of Examples 4, 6 and 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.25 to 3.0 times, from 1.5 to 3 times, from 1.5 to 2.25 times, or from 1.75 to 3 times the activity of Op-XI at pH 7.5.

The XIs of the disclosure can also be characterized in terms of their tolerance to acidic environments (e.g., at a pH of 6.5 or 6). In some embodiments, a XI of the disclosure has at least 1.9 times the activity of the Op-XI at pH 6, for example using the assay described in Example 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.9 to 4.1 times, from 2.4 to 4.1 times, from 2.4 to 3.9 times, or 2.4 to 4.1 times the activity of Op-XI at pH 6.

Tolerance to acidic environments can also be characterized as a ratio of activity at pH 6 to activity at pH 7.5 (“a pH 6 to pH 7.5 activity ratio”), for example as measured using the assay of Example 7. In some embodiments, the pH 6 to pH 7.5 activity ratio is at least 0.5 or at least 0.6. In various embodiments, the pH 6 to pH 7.5 activity ratio is 0.5-0.9 or 0.6-0.9.
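The two activity metrics described above reduce to simple ratios. The sketch below is illustrative only: the function names are hypothetical, the inputs are assumed to be activities measured in the same units under the same assay, and no measured values from the Examples are reproduced here.

```python
def relative_activity(xi_activity, op_xi_activity):
    """Activity of a candidate XI expressed as a multiple of Op-XI activity
    measured at the same pH and in the same units."""
    if op_xi_activity <= 0:
        raise ValueError("reference activity must be positive")
    return xi_activity / op_xi_activity


def ph6_to_ph75_ratio(activity_ph6, activity_ph75):
    """The 'pH 6 to pH 7.5 activity ratio' for a single enzyme."""
    if activity_ph75 <= 0:
        raise ValueError("activity at pH 7.5 must be positive")
    return activity_ph6 / activity_ph75


def meets_threshold(value, minimum):
    """True if a ratio is at or above a stated minimum (e.g., 0.5 or 1.3)."""
    return value >= minimum
```

For instance, with hypothetical activities of 3.0 and 2.0 units for a candidate XI and Op-XI at pH 7.5, `relative_activity` returns 1.5, which satisfies an "at least 1.3 times" criterion.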

The xylose isomerases of the disclosure can have one or more (e.g., up to 2, 3, 5, 10, or 20) conservative amino acid substitutions relative to the polypeptide of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176 or to the portion thereof discussed above. The conservative substitutions can be chosen from among a group having a similar side chain to the reference amino acid. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulphur-containing side chains is cysteine and methionine. Accordingly, exemplary conservative substitutions for each of the naturally occurring amino acids are as follows: ala to ser; arg to lys; asn to gln or his; asp to glu; cys to ser or ala; gln to asn; glu to asp; gly to pro; his to asn or gln; ile to leu or val; leu to ile or val; lys to arg, gln or glu; met to leu or ile; phe to met, leu or tyr; ser to thr; thr to ser; trp to tyr; tyr to trp or phe; and val to ile or leu.
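The side-chain groupings above lend themselves to a simple lookup. The following sketch classifies substitutions between two equal-length, ungapped sequences using only the six groups named in the text; the function names are hypothetical, and this group-based test is narrower than the exemplary pair list (for example, ala to ser falls outside the groups).

```python
# Side-chain groups as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulphur": set("CM"),
}


def is_conservative(ref, alt):
    """True if ref and alt differ but share one of the groups above."""
    if ref == alt:
        return False
    return any(ref in g and alt in g for g in SIDE_CHAIN_GROUPS.values())


def count_substitutions(reference, variant):
    """Return (conservative, non_conservative) counts for two
    equal-length, ungapped sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = non_conservative = 0
    for ref, alt in zip(reference, variant):
        if ref == alt:
            continue
        if is_conservative(ref, alt):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative
```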

The present disclosure also provides a fusion protein that includes at least a portion (e.g., a fragment or domain) of a XI polypeptide of the disclosure attached to one or more fusion segments, which are typically heterologous to the XI polypeptide. Suitable fusion segments include, without limitation, segments that can provide other desirable biological activity or facilitate purification of the XI polypeptide (e.g., by affinity chromatography). Fusion segments can be joined to the amino or carboxy terminus of a XI polypeptide. The fusion segments can be susceptible to cleavage.

4.2 Xylose Isomerase Nucleic Acids

A “XI nucleic acid of the disclosure” is a nucleic acid encoding a xylose isomerase of the disclosure. In certain embodiments, the xylose isomerase nucleic acid of the disclosure is encoded by a nucleotide sequence of any one of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or a sequence having at least about 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity thereto. The xylose isomerase nucleic acid of the disclosure can also have 100% sequence identity to one of the foregoing sequences.

The present disclosure provides nucleic acids encoding a polypeptide of the disclosure, for example one described in Section 4.1 above. The disclosure provides isolated, synthetic or recombinant nucleic acids comprising a nucleic acid sequence having at least about 70%, e.g., at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, or complete (100%) sequence identity to a nucleic acid of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, over a region of at least about 10, e.g., at least about 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 1950, or 2000 nucleotides.
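Percent identity "over a region" can be illustrated with a small helper that operates on two rows of an existing pairwise alignment. This is a sketch under stated assumptions: the sequences are already aligned (gaps written as `-`), the function name is hypothetical, and identity is computed as identical columns divided by compared columns in the window, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, start=0, length=None):
    """Percent identity over a window of alignment columns.

    aligned_a and aligned_b are equal-length rows of a pairwise
    alignment; columns where both rows are gaps are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("alignment rows must have equal length")
    end = len(aligned_a) if length is None else start + length
    matches = compared = 0
    for a, b in zip(aligned_a[start:end], aligned_b[start:end]):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare in the requested region")
    return 100.0 * matches / compared
```

Passing `start` and `length` restricts the comparison to a region, e.g. the first 100 aligned nucleotides.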

Nucleic acids of the disclosure also include isolated, synthetic or recombinant nucleic acids encoding a XI polypeptide having the sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176, and subsequences thereof (e.g., a conserved domain or a catalytic domain), and variants thereof.

To increase the likelihood that a XI polypeptide is recombinantly expressed, a XI nucleic acid may be adapted to optimize its codon usage to that of the chosen cell. Several methods for codon optimization are known in the art. For expression in yeast, an exemplary method to optimize codon usage of the nucleotide sequences to that of the yeast is a codon pair optimization technology as disclosed in WO 2006/077258 and/or WO 2008/000632. WO 2008/000632 addresses codon-pair optimization. Codon-pair optimization is a method wherein the nucleotide sequences encoding a polypeptide are modified with respect to their codon-usage, in particular the codon-pairs that are used, to obtain improved expression of the nucleotide sequence encoding the polypeptide and/or improved production of the encoded polypeptide. Codon pairs are defined as a set of two subsequent triplets (codons) in a coding sequence. Boles codon optimization (see Table 2 of Wiedemann and Boles, 2008, Appl. Environ. Microbiol. 74:2043-2050) can also be used to optimize expression and activity of XIs in yeast. Alternatively, the XI sequence can be optimized using commercially available software, such as Gene Designer (DNA2.0). Preferably, codon optimized sequences avoid nucleotide repeats and restriction sites that are utilized in cloning the XI nucleic acids, by adjusting the settings in commercial software or by manually altering the sequences to substitute codons that introduce undesired sequences, for example with highly utilized codons in the organism of interest. Exemplary codon optimized open reading frames for expression in S. cerevisiae are SEQ ID NO:238 (encoding a XI of SEQ ID NO:54), SEQ ID NO:239 (encoding a XI of SEQ ID NO:58), SEQ ID NO:244 (encoding a XI of SEQ ID NO:78), SEQ ID NO:245 (encoding a XI of SEQ ID NO:96), SEQ ID NO:246 (encoding a XI of SEQ ID NO:38), SEQ ID NO:247 (encoding a XI of SEQ ID NO:78), SEQ ID NO:248 (encoding a XI of SEQ ID NO:96), and SEQ ID NO:249 (encoding a XI of SEQ ID NO:38). In various embodiments, the disclosure provides nucleic acids comprising nucleotide sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of any one of SEQ ID NOs:238, 239, 244, 245, 246, 247, 248 and 249, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.
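Two of the checks mentioned above, tallying codon pairs and screening for restriction sites used in cloning, can be sketched as follows. This is not the method of WO 2008/000632 or of any commercial tool; the function names are hypothetical, and the example site dictionary (EcoRI and BamHI recognition sequences) is an assumption chosen only for illustration.

```python
from collections import Counter


def split_codons(orf):
    """Split an open reading frame into consecutive triplets."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def codon_pair_counts(orf):
    """Count codon pairs, i.e. sets of two subsequent codons."""
    codons = split_codons(orf.upper())
    return Counter(zip(codons, codons[1:]))


def find_sites(orf, sites):
    """Return {site name: [0-based positions]} for motifs found in the ORF.

    Only the given strand is scanned, which is sufficient for
    palindromic recognition sequences such as the two below.
    """
    orf = orf.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        i = orf.find(motif)
        while i != -1:
            positions.append(i)
            i = orf.find(motif, i + 1)
        if positions:
            hits[name] = positions
    return hits


# Example cloning sites (assumed for illustration).
CLONING_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}
```

A candidate sequence that returns a non-empty result from `find_sites` would be a candidate for synonymous codon substitution at the reported positions.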

4.3 Host Cells and Recombinant Expression

The disclosure also provides host cells transformed with a XI nucleic acid and recombinant host cells engineered to express XI polypeptides. The XI nucleic acid construct may be extrachromosomal, on a plasmid, which can be a low copy plasmid or a high copy plasmid. The nucleic acid construct may be maintained episomally and thus comprise a sequence for autonomous replication, such as an autonomously replicating sequence. Alternatively, a XI nucleic acid may be integrated in one or more copies into the genome of the cell. Integration into the cell's genome may occur at random by non-homologous recombination but preferably, the nucleic acid construct may be integrated into the cell's genome by homologous recombination as is well known in the art. In certain embodiments, the host cell is bacterial or fungal (e.g., a yeast or a filamentous fungus).

Suitable host cells of the bacterial genera include, but are not limited to, cells of Escherichia, Bacillus, Lactobacillus, Pseudomonas, and Streptomyces. Suitable cells of bacterial species include, but are not limited to, cells of Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Lactobacillus brevis, Pseudomonas aeruginosa, and Streptomyces lividans.

Suitable host cells of the genera of yeast include, but are not limited to, cells of Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Phaffia, Issatchenkia and Yarrowia. In specific embodiments, the recombinant cell is a S. cerevisiae, C. albicans, S. pombe, S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, H. polymorpha, K. lactis, I. orientalis, K. marxianus, K. fragilis, P. pastoris, P. canadensis or P. rhodozyma. Exemplary yeast strains that are suitable for recombinant XI expression include, but are not limited to, Lallemand LYCC 6391, Lallemand LYCC 6939, Lallemand LYCC 6469 (all from Lallemand, Inc., Montreal, Canada); NRRL YB-1952 (ARS (NRRL) Collection, U.S. Department of Agriculture); and BY4741.

Suitable host cells of filamentous fungi include all filamentous forms of the subdivision Eumycotina. Suitable cells of filamentous fungal genera include, but are not limited to, cells of Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Corynascus, Chaetomium, Cryptococcus, Filobasidium, Fusarium, Gibberella, Humicola, Hypocrea, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Scytalidium, Schizophyllum, Sporotrichum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, and Trichoderma. In certain aspects, the recombinant cell is a Trichoderma sp. (e.g., Trichoderma reesei), Penicillium sp., Humicola sp. (e.g., Humicola insolens), Aspergillus sp. (e.g., Aspergillus niger), Chrysosporium sp., Fusarium sp., or Hypocrea sp. Suitable cells can also include cells of various anamorph and teleomorph forms of these filamentous fungal genera.

Suitable cells of filamentous fungal species include, but are not limited to, cells of Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium lucknowense, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Coprinus cinereus, Coriolus hirsutus, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Penicillium purpurogenum, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces flavus, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, and Trichoderma viride.

Typically, for recombinant expression, the XI nucleic acid will be operably linked to one or more nucleic acid sequences capable of providing for or aiding the transcription and/or translation of the XI sequence, for example a promoter operable in the organism in which the XI is to be expressed. The promoters can be homologous or heterologous, and constitutive or inducible.

Preferably, the XI polypeptide is expressed in the cytosol and therefore lacks a mitochondrial or peroxisomal targeting signal.

Where recombinant expression in a filamentous fungal host is desired, the promoter can be a fungal promoter (including but not limited to a filamentous fungal promoter), a promoter operable in plant cells, or a promoter operable in mammalian cells.

As described in U.S. provisional application No. 61/553,901, filed Oct. 31, 2011, the contents of which are hereby incorporated in their entireties, promoters that are constitutively active in mammalian cells (which can be derived from a mammalian genome or the genome of a mammalian virus) are capable of eliciting high expression levels in filamentous fungi such as Trichoderma reesei. An exemplary promoter is the cytomegalovirus (“CMV”) promoter.

As described in U.S. provisional application No. 61/553,897, filed Oct. 31, 2011, the contents of which are hereby incorporated in their entireties, promoters that are constitutively active in plant cells (which can be derived from a plant genome or the genome of a plant virus) are capable of eliciting high expression levels in filamentous fungi such as Trichoderma reesei. Exemplary promoters are the cauliflower mosaic virus (“CaMV”) 35S promoter or the Commelina yellow mottle virus (“CoYMV”) promoter.

Mammalian, mammalian viral, plant and plant viral promoters can drive particularly high expression when the associated 5′ UTR sequence (i.e., the sequence which begins at the transcription start site and ends one nucleotide (nt) before the start codon) normally associated with the mammalian or mammalian viral promoter is replaced by a fungal 5′ UTR sequence.

The source of the 5′ UTR can vary provided it is operable in the filamentous fungal cell. In various embodiments, the 5′ UTR can be derived from a yeast gene or a filamentous fungal gene. The 5′ UTR can be from the same species as one other component in the expression cassette (e.g., the promoter or the XI coding sequence), or from a different species. The 5′ UTR can be from the same species as the filamentous fungal cell that the expression construct is intended to operate in. In an exemplary embodiment, the 5′ UTR comprises a sequence corresponding to a fragment of a 5′ UTR from a T. reesei glyceraldehyde-3-phosphate dehydrogenase (gpd). In a specific embodiment, the 5′ UTR is not naturally associated with the CMV promoter.

Examples of other promoters that can be used include, but are not limited to, a cellulase promoter, a xylanase promoter, and the 1818 promoter (previously identified as a highly expressed protein by EST mapping in Trichoderma). For example, the promoter can suitably be a cellobiohydrolase, endoglucanase, or β-glucosidase promoter. A particularly suitable promoter can be, for example, a T. reesei cellobiohydrolase, endoglucanase, or β-glucosidase promoter. Non-limiting examples of promoters include a cbh1, cbh2, egl1, egl2, egl3, egl4, egl5, pki1, gpd1, xyn1, or xyn2 promoter.

For recombinant expression in yeast, suitable promoters for S. cerevisiae include the MFα1 promoter, galactose inducible promoters such as the GAL1, GAL7 and GAL10 promoters, glycolytic enzyme promoters including the TPI and PGK promoters, the TDH3 promoter, the TEF1 promoter, the TRP1 promoter, the CYC1 promoter, the CUP1 promoter, the PHO5 promoter, the ADH1 promoter, and the HSP promoter. Promoters that are active at different stages of growth or production (e.g., idiophase or trophophase) can also be used (see, e.g., Puig et al., 1996, Biotechnology Letters 18(8):887-892; Puig and Pérez-Ortin, 2000, Systematic and Applied Microbiology 23(2):300-303; Simon et al., 2001, Cell 106:697-708; Wittenberg and Reed, 2005, Oncogene 24:2746-2755). A suitable promoter in the genus Pichia sp. is the AOX1 (methanol utilization) promoter.

The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the nucleic acid sequence encoding the XI polypeptide. Culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those skilled in the art. As noted, many references are available for the culture and production of many cells, including cells of bacterial and fungal origin. Cell culture media in general are set forth in Atlas and Parks (eds.), 1993, The Handbook of Microbiological Media, CRC Press, Boca Raton, Fla., which is incorporated herein by reference. For recombinant expression in filamentous fungal cells, the cells are cultured in a standard medium containing physiological salts and nutrients, such as described in Pourquie et al., 1988, Biochemistry and Genetics of Cellulose Degradation, eds. Aubert, et al., Academic Press, pp. 71-86; and Ilmen et al., 1997, Appl. Environ. Microbiol. 63:1298-1306. Culture conditions are also standard, e.g., cultures are incubated at 30° C. in shaker cultures or fermenters until desired levels of XI expression are achieved. Preferred culture conditions for a given filamentous fungus may be found in the scientific literature and/or from the source of the fungi such as the American Type Culture Collection (ATCC). After fungal growth has been established, the cells are exposed to conditions effective to cause or permit the expression of a XI.

In cases where a XI coding sequence is under the control of an inducible promoter, the inducing agent, e.g., a sugar, metal salt or antibiotic, is added to the medium at a concentration effective to induce XI expression.

In addition to recombinant expression of a XI polypeptide, a host cell of the disclosure may further include one or more genetic modifications that increase the cell's ability to utilize xylose as a substrate in a fermentation process. Exemplary additional modifications create one, two, three, four, five or even more of the following phenotypes: (a) increase in xylose transport into the cell; (b) increase in aerobic growth rate on xylose; (c) increase in xylulose kinase activity; (d) increase in flux through the pentose phosphate pathway into glycolysis; (e) modulation of aldose reductase activity; (f) decrease in sensitivity to catabolite repression; (g) increase in tolerance to biofuels, e.g., ethanol; (h) increase in tolerance to intermediate production (for example xylitol); (i) increase in temperature tolerance; (j) osmolarity of organic acids; and (k) a reduced production of byproducts.

As illustrated below, a modification that results in one or more of the foregoing phenotypes can be a result of increasing or decreasing expression of an endogenous protein (e.g., by at least a factor of about 1.1, about 1.2, about 1.5, about 2, about 5, about 10 or about 20) or a result of introducing expression of a heterologous polypeptide. For avoidance of doubt, “decreasing” or “reducing” gene expression encompasses eliminating expression. Decreasing (or reducing) the expression of an endogenous protein can be accomplished by inactivating one or more (or all) endogenous copies of a gene in a cell. A gene can be inactivated by deletion of at least part of the gene or by disruption of the gene. This can be achieved by deleting some or all of a gene coding sequence or regulatory sequence whose deletion results in a reduction of gene expression in the cell. Examples of modifications that increase xylose utilization or yield of fermentation product are described below.

Increasing Xylose Transport:

Xylose transport can be increased directly or indirectly. For example, a recombinant cell may include one or more genetic modifications that result in expression of a xylose transporter. Exemplary transporters include, but are not limited to, GXF1, SUT1 and At6g59250 from Candida intermedia, Pichia stipitis (now renamed Scheffersomyces stipitis; the terms are used interchangeably herein) and Arabidopsis thaliana, respectively (Runquist et al., 2010, Biotechnol. Biofuels 3:5), as well as HXT4, HXT5, HXT7, GAL2, AGT1, and GXF2 (see, e.g., Matsushika et al., 2009, Appl. Microbiol. Biotechnol. 84:37-53). Other transporters include PsAraT, SUT2-4 and XUT1-5 from P. stipitis; GXS1 from Candida intermedia; XylHP and DEHAOD02167 from Debaryomyces hansenii; and YALI0C06424 from Yarrowia lipolytica (see, e.g., Young et al., 2011, Appl. Environ. Microbiol. 77:3311-3319). Xylose transport can also be increased by (over-)expression of low-affinity hexose transporters, which are capable of non-selectively transporting sugars, including xylose, into the cell once glucose levels are low (e.g., 0.2-1.0 g/L); these include CgHXT1-CgHXT5 from Colletotrichum graminicola. The foregoing modifications can be made singly or in combinations of two, three or more modifications.

Increasing Xylulose Kinase Activity:

Xylulose kinase activity can be increased by overexpression of a xylulose kinase, e.g., xylulose kinase (XKS1; Saccharomyces genome database (“SGD”) accession no. YGR194C) of S. cerevisiae, particularly where the recombinant cell is a yeast cell. In one embodiment, a S. cerevisiae cell is engineered to include at least 2 additional copies of xylulose kinase under the control of a strong constitutive promoter such as TDH3, TEF1 or PGK1. In another embodiment, overexpression of an endogenous xylulose kinase is engineered, the xylulose kinase having improved kinetic activities through the use of protein engineering techniques known to those skilled in the art.

Increasing Flux Through the Pentose Phosphate Pathway:

This can be achieved by increasing expression of one or more genes in the pentose phosphate pathway, for example S. cerevisiae transaldolase TAL1 (SGD accession no. YLR354C), transketolase TKL1 (SGD accession no. YPR074C), ribulose 5-phosphate epimerase RPE1 (SGD accession no. YJL121C) and ribose-5-phosphate ketoisomerase RKI1 (SGD accession no. YOR095C) and/or one or more genes to increase glycolytic flux, for example S. cerevisiae pyruvate kinase PYK1/CDC19 (SGD accession no. YAL038W), pyruvate decarboxylase PDC1 (SGD accession no. YLR044C), pyruvate decarboxylase PDC5 (SGD accession no. YLR134W), pyruvate decarboxylase PDC6 (SGD accession no. YGR087C), the alcohol dehydrogenases ADH1-5 (SGD accession nos. YOL086C, YMR303C, YMR083W, YGL256W, and YBR145W, respectively), and hexose kinase HXK1-2 (SGD accession nos. YFR053C and YGL253W, respectively). In one embodiment, the yeast cell has one additional copy each of TAL1, TKL1, RPE1 and RKI1 from S. cerevisiae under the control of strong constitutive promoters (e.g., PGK1, TDH3, TEF1); and may also include improvements to glycolytic flux (e.g., increased copies of genes such as PYK1, PDC1, PDC5, PDC6, ADH1-5) and glucose-6-phosphate and hexokinase. The foregoing modifications can be made singly or in combinations of two, three or more modifications.

Modulating Aldose Reductase Activity:

A recombinant cell can include one or more genetic modifications that increase or reduce (unspecific) aldose reductase (sometimes called aldo-keto reductase) activity. Aldose reductase activity can be reduced by one or more genetic modifications that reduce the expression of or inactivate a gene encoding an aldose reductase, for example S. cerevisiae GRE3 (SGD accession no. YHR104W).

In certain embodiments, GRE3 expression is reduced. In one aspect, the recombinant cell is a yeast cell in which the GRE3 gene is deleted. Deletion of GRE3 decreased xylitol yield by 49% and biomass production by 31%, but increased ethanol yield by 19% (Traff-Bjerre et al., 2004, Yeast 21:141-150). In another aspect, the recombinant cell is a yeast cell which has a reduction in expression of GRE3. Reducing GRE3 expression has been shown to result in a two-fold decrease in by-product (i.e., xylitol) formation and an associated improvement in ethanol yield (Traff et al., 2001, Appl. Environ. Microbiol. 67:5668-5674).

In another embodiment, the recombinant cell is a cell (optionally but not necessarily a yeast cell) in which GRE3 is overexpressed. In a study of the effect of GRE3 overexpression on xylose utilization in S. cerevisiae, an increase of about 30% in xylose consumption and about 120% in ethanol production was noted (Traff-Bjerre et al., 2004, Yeast 21:141-150).

Decreasing Xylose Reductase Activity:

A recombinant cell may include one or more genetic modifications that reduce xylose reductase activity. Xylose reductase activity can be reduced by one or more genetic modifications that reduce the expression of or inactivate a gene encoding a xylose reductase.

Decreasing Sensitivity to Catabolite Repression:

Glucose and other sugars, such as galactose or maltose, are able to cause carbon catabolite repression in Crabtree-positive yeast, such as S. cerevisiae. In one study, xylose was found to decrease the derepression of various enzymes of an engineered S. cerevisiae strain capable of xylose utilization by at least 10-fold in the presence of ethanol. Xylose also impaired the derepression of galactokinase and invertase (Belinchon & Gancedo, 2003, Arch. Microbiol. 180:293-297). In certain embodiments, in order to reduce catabolite sensitivity, yeast can include one or more genetic modifications that reduce expression of one or more of GRR1 (SGD accession no. YJR090C), the gene assigned SGD accession no. YLR042C, GAT1 (SGD accession no. YKR067W) and/or one or more genetic modifications that decrease expression of one or more of SNF1 (SGD accession no. YDR477W), SNF4 (SGD accession no. YGL115W), MIG1 (SGD accession no. YGL035C) and CRE1 (SGD accession no. YJL127C). In further embodiments, yeast can include one or more genetic modifications that result in overexpression of the pentose phosphate pathway enzymes. In yet further embodiments, yeast can include one or more genetic modifications that reduce expression of hexo-/glucokinase. In yet a further embodiment, yeast can include one or more genetic modifications that modulate the activity of one or more GATA factors, for example GAT1, DAL80 (SGD accession no. YKR034W), GZF3 (SGD accession no. YJL110C) and GLN3 (SGD accession no. YER040W). The foregoing modifications can be made singly or in combinations of two, three or more modifications.

Increasing Tolerance to Biofuels (e.g., Ethanol), Pathway Intermediates (e.g., Xylitol), Organic Acids and Temperature:

For efficient bioethanol production from lignocellulosic biomass, it is useful to improve cellular tolerance to toxic compounds released during the pretreatment of biomass. In one study, the gene encoding PHO13 (SGD accession no. YDL236W), a protein with alkaline phosphatase activity, was disrupted. This resulted in improved ethanol production from xylose in the presence of three major inhibitors (i.e., acetic acid, formic acid and furfural). Further, the specific ethanol productivity of the mutant in the presence of 90 mM furfural was four-fold higher (Fujitomi et al., 2012, Biores. Tech., 111:161-166). Thus, in one embodiment, yeast has one or more genetic modifications that reduce PHO13 expression. In other embodiments, yeast, bacterial and fungal cells are evolved under selective conditions to identify strains that can withstand higher temperatures, higher levels of intermediates, higher levels of organic acids and/or higher levels of biofuels (e.g., ethanol). In yet other embodiments, yeast are engineered to reduce expression of FPS1 (SGD accession no. YLL043W); overexpress unsaturated lipid and ergosterol biosynthetic pathways; reduce expression of PHO13 and/or SSK2 (SGD accession no. YNR031C); modulate global transcription factor cAMP receptor protein, through increasing or decreasing expression; increase expression of MSN2 (SGD accession no. YMR037C), RCN1 (SGD accession no. YKL159C), RSA3 (SGD accession no. YLR221C), CDC19 and/or ADH1; or increase expression of Rice ASR1. The foregoing modifications can be made singly or in combinations of two, three or more modifications.

Reducing Production of Byproducts:

Glycerol is one of the main byproducts in C6 ethanol production. Reducing glycerol is desirable for increasing xylose utilization by yeast. Production of glycerol can be reduced by deleting the gene encoding the FPS1 channel protein, which mediates glycerol export, and GPD2 (SGD accession no. YOL059W), which encodes glycerol-3-phosphate dehydrogenase; optionally along with overexpression of GLT1 (SGD accession no. YDL171C) and GLN1 (SGD accession no. YPR035W). In one study, FPS1 and GPD2 were knocked out in one S. cerevisiae strain, and in another were replaced by overexpression of GLT1 and GLN1, which encode glutamate synthase and glutamine synthetase, respectively. When grown under microaerobic conditions, these strains showed ethanol yield improvements of 13.17% and 6.66%, respectively. Conversely, glycerol, acetic acid and pyruvic acid were found to all decrease, with glycerol down 37.4% and 41.7%, respectively (Zhang and Chen, 2008, Chinese J. Chem. Eng. 16:620-625).

Production of glycerol can also be reduced by deleting the NADH-dependent glycerol-3-phosphate dehydrogenase 1 (GPD1; SGD accession no. YDL022W) and/or the NADPH-dependent glutamate dehydrogenase 1 (GDH1; SGD accession no. YOR375C). Sole deletion of GPD1 or GDH1 reduces glycerol production, and double deletion results in a 46.4% reduction of glycerol production as compared to wild-type S. cerevisiae (Kim et al., 2012, Bioproc. Biosys. Eng. 35:49-54). Deleting FPS1 can decrease production of glycerol for osmoregulatory reasons.

Reducing production of acetate can also increase xylose utilization. Deleting ALD6 (SGD accession no. YPL061W) can decrease production of acetate.

ADH2 can also be deleted to reduce or eliminate acetaldehyde formation from ethanol and thereby increase ethanol yield.

The foregoing modifications to reduce byproduct formation can be made singly or in combinations of two, three or more modifications.

In addition to ethanol production, a recombinant XI-expressing cell of the disclosure can be suitable for the production of non-ethanolic fermentation products. Such non-ethanolic fermentation products include in principle any bulk or fine chemical that is producible by a eukaryotic microorganism such as a yeast or a filamentous fungus. Such fermentation products may be, for example, butanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, itaconic acid, an amino acid, 1,3-propane-diol, ethylene, glycerol, a β-lactam antibiotic or a cephalosporin. A preferred modified host cell of the disclosure for production of non-ethanolic fermentation products is a host cell that contains a genetic modification that results in decreased alcohol dehydrogenase activity.

Cells expressing the XI polypeptides of the disclosure can be grown under batch, fed-batch or continuous fermentation conditions. Classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. A variation of the batch system is a fed-batch fermentation in which the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. Continuous fermentation systems strive to maintain steady state growth conditions. Methods for modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology.

4.4 Fermentation Methods

A further aspect of the disclosure relates to fermentation processes in which the recombinant XI-expressing cells are used for the fermentation of a carbon source comprising a source of xylose. Thus, in certain embodiments, the disclosure provides a process for producing a fermentation product by (a) fermenting a medium containing a source of xylose with a recombinant XI-expressing cell as defined herein above, under conditions in which the cell ferments xylose to the fermentation product, and optionally, (b) recovery of the fermentation product. In some embodiments, the fermentation product is an alcohol (e.g., ethanol, butanol, etc.), a fatty alcohol (e.g., a C8-C20 fatty alcohol), a fatty acid (e.g., a C8-C20 fatty acid), lactic acid, 3-hydroxypropionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, an amino acid, 1,3-propanediol, itaconic acid, ethylene, glycerol, or a β-lactam antibiotic such as Penicillin G or Penicillin V and fermentative derivatives thereof and cephalosporins. The fermentation process may be an aerobic or an anaerobic fermentation process.

In addition to a source of xylose, the carbon source in the fermentation medium may also comprise a source of glucose. The source of xylose or glucose may be xylose or glucose as such or may be any carbohydrate oligo- or polymer comprising xylose or glucose units, such as lignocellulose, xylans, cellulose, starch and the like. Most microorganisms possess carbon catabolite repression that results in sequential consumption of mixed sugars derived from the lignocellulose, reducing the efficacy of the overall process. To increase the efficiency of fermentation, microorganisms that are capable of simultaneous consumption of mixed sugars (e.g., glucose and xylose) have been developed, for example by rendering them less sensitive to glucose repression (see, e.g., Kim et al., 2010, Appl. Microbiol. Biotechnol. 88:1077-85 and Ho et al., 1999, Adv. Biochem. Eng. Biotechnol. 65:163-92). Such cells can be used for recombinant XI expression and in the fermentation methods of the disclosure.

The fermentation process is preferably run at a temperature that is optimal for the recombinant XI-expressing cells. Thus, for most yeast or fungal host cells, the fermentation process is performed at a temperature which is less than 38° C., unless temperature tolerant mutant strains are used, in which case the temperature may be higher. For most yeast or filamentous fungal host cells, the fermentation process is suitably performed at a temperature which is lower than 35° C., 33° C., 30° C. or 28° C. Optionally, the temperature is higher than 20° C., 22° C., or 25° C.

An exemplary process is a process for the production of ethanol, whereby the process comprises the steps of: (a) fermenting a medium containing a source of xylose with a transformed host cell as defined above, whereby the host cell ferments xylose to ethanol; and optionally, (b) recovery of the ethanol. The fermentation medium can also comprise a source of glucose that is also fermented to ethanol. The source of xylose can be sugars produced from biomass or agricultural wastes. Many processes for the production of monomeric sugars such as glucose generated from lignocellulose are well known, and are suitable for use herein. In brief, the cellulolytic material may be enzymatically, chemically, and/or physically hydrolyzed to a glucose and xylose containing fraction. Alternatively, the recombinant XI-expressing cells of the disclosure can be further transformed with one or more genes encoding enzymes effective for hydrolysis of complex substrates such as lignocellulose, including but not limited to cellulases, hemicellulases, peroxidases, laccases, chitinases, proteases, and pectinases. The recombinant cells of the disclosure can then be fermented under anaerobic conditions in the presence of glucose and xylose. Where the recombinant cell is a yeast cell, the fermentation techniques and conditions described, for example, by Wyman (1994, Biores. Technol. 50:3-16) and Olsson and Hahn-Hagerdal (1996, Enzyme Microb. Technol. 18:312-331) can be used. After completion of the fermentation, the ethanol may be recovered and optionally purified or distilled. Solid residue containing lignin may be discarded or burned as a fuel.

The fermentation process may be run under aerobic or anaerobic conditions. In some embodiments, the process is carried out under microaerobic or oxygen limited conditions. Fermentation can be carried out in a batch, fed-batch, or continuous configuration within (bio)reactors.

5. EXAMPLES

5.1 Materials and Methods

5.1.1 Yeast Culture

Unless stated otherwise for a particular example, yeast transformants were grown in SC-ura media with about 2% glucose at 30° C. for about 24 hours. The media contains approx. 20 g agar, approx. 134 g BD Difco™ Yeast Nitrogen Base without amino acids (BD, Franklin Lakes, N.J.), and approx. 2 g SC amino-acid mix containing about 85 mg of each of the following components unless noted (quantity listed in parentheses): L-Adenine (21.0), L-Alanine, L-Arginine, L-Asparagine, L-Aspartic Acid, L-Cysteine, Glutamine, L-Glutamic Acid, Glycine, L-Histidine, Myo-Inositol, L-Isoleucine, L-Leucine (173.4), L-Lysine, L-Methionine, p-Aminobenzoic Acid (8.6), L-Phenylalanine, L-Proline, L-Serine, L-Threonine, L-Tryptophan, L-Tyrosine, L-Valine.

5.1.2 Xylose Isomerase Activity

XI activity in cell lysates was determined using a method based on that of Kersters-Hilderson et al., 1986, Enzyme Microb. Technol. 9:145-148, in which enzymatic conversion of xylose to xylulose by the XI is coupled with the enzymatic conversion of the product (xylulose) to xylitol via the enzyme sorbitol dehydrogenase (SDH). SDH activity requires the oxidation of NADH to NAD⁺. The rate of oxidation of NADH is directly proportional to the rate of SDH conversion of D-xylulose to D-xylitol and is measured by the decrease in absorbance at 340 nm. One unit of enzyme activity as measured by this assay is a decrease of 1 μmole of NADH per minute under assay conditions. All reactions, solutions, plates, and the spectrophotometer were equilibrated to about 35° C. prior to use. Assays were performed either on fresh lysates immediately after preparation or on lysates that had been frozen at −20° C. immediately after preparation. Assays were performed using a BioTek Synergy H1 Hybrid Reader spectrophotometer and 96-well plates (Corning, Costar® #3598). All spectrophotometric readings were performed at 340 nm. A standard curve of NADH was generated with each assay with concentrations ranging from 0 to about 0.6 mM.
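The NADH standard curve described above can be reduced to a slope that relates absorbance at 340 nm to NADH concentration. The following is a minimal sketch only, assuming an ordinary least-squares fit; the function names and the readings are hypothetical illustrations and are not data or software from this disclosure.

```python
def fit_standard_curve(conc_mM, a340):
    """Ordinary least-squares line through (NADH concentration, A340) points.

    Returns (slope, intercept); the slope is in absorbance units per mM NADH.
    """
    n = len(conc_mM)
    mean_x = sum(conc_mM) / n
    mean_y = sum(a340) / n
    sxx = sum((x - mean_x) ** 2 for x in conc_mM)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc_mM, a340))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nadh_rate_mM_per_min(a340_decrease_per_min, slope_per_mM):
    """Convert a rate of A340 decrease into a rate of NADH consumption (mM/min)."""
    return a340_decrease_per_min / slope_per_mM


# Hypothetical standards spanning the 0 to about 0.6 mM range named in the text.
standards_mM = [0.0, 0.2, 0.4, 0.6]
readings_a340 = [0.10, 0.50, 0.90, 1.30]  # illustrative values, not measured data

slope, intercept = fit_standard_curve(standards_mM, readings_a340)
rate = nadh_rate_mM_per_min(0.04, slope)  # 0.04 A340 units lost per minute
```

Multiplying the resulting mM/min value by the reaction volume would give μmole NADH per minute, which is how the unit definition above is expressed.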

The reaction buffer used for experiments at pH 7.5 was about 100 mM Tris-HCl (pH 7.5). The assay mix was prepared as follows: reaction buffer to which was added about 10 mM MgCl₂, 0.15 mM NADH and 0.05 mg/ml SDH (Roche, catalog #50-720-3313). For experiments where activity was also measured at pH 6, the buffer was changed to about 100 mM sodium phosphate, pH 6. The assay mix for the entire experiment was then prepared as follows: about 10 mM MgCl₂, 1.2 mM NADH and 0.02 mg/ml SDH.

Any sample dilutions were performed using the reaction buffer as diluent. Reactions were set up by aliquoting about 90 μl of assay mix into each well of the plates. About 10 μl of each XI sample was added to the wells. The reactions were started by the addition of about 100 μl substrate solution (about 1 M D-xylose). Reactions were mixed and read immediately using kinetic assay mode for about 10 minutes. Volumetric activity (VA) units are in milli-absorbance (mA) units per minute per ml of lysate added to the reactions (mA/min/ml). Background VA rates of negative control wells (no enzyme added) were subtracted from VA of samples. Fold improvement over positive control (FIOPC) was obtained by dividing the VA of the XI samples by the VA observed for a control (Orpinomyces xylose isomerase, NCBI:169733248 (Op-XI)) expressed using the same host and expression vector. In some characterizations, the slope of an NADH standard curve was used to convert VA (mA/min) to μmole NADH/min (or Units). If protein quantitation was performed, specific activities (SA) were calculated, where the units for SA are μmole NADH/min/mg (or U/mg lysate protein). All activities listed (VA or SA) account for any dilutions, volumes of lysate added, and protein concentrations for the lysates assayed.
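The VA, FIOPC and SA calculations described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the software used for the reported results: the function names are hypothetical, the standard-curve slope is assumed to be expressed in mA per μmole NADH in the well, and all numbers are invented for the example.

```python
def volumetric_activity(rate_mA_per_min, background_mA_per_min,
                        lysate_ml, dilution_factor=1.0):
    """Background-corrected VA in mA/min per ml of undiluted lysate."""
    corrected = rate_mA_per_min - background_mA_per_min
    return corrected * dilution_factor / lysate_ml


def fiopc(sample_va, control_va):
    """Fold improvement over the positive control (e.g., Op-XI)."""
    return sample_va / control_va


def specific_activity(va_mA_per_min_per_ml, slope_mA_per_umol,
                      protein_mg_per_ml):
    """SA in U/mg lysate protein (U = μmole NADH per minute).

    Assumes the standard-curve slope is given in mA per μmole NADH.
    """
    units_per_ml = va_mA_per_min_per_ml / slope_mA_per_umol
    return units_per_ml / protein_mg_per_ml


# Hypothetical plate readings; 10 μl (0.01 ml) of lysate per well as in the text.
sample_va = volumetric_activity(50.0, 5.0, lysate_ml=0.01)
control_va = volumetric_activity(20.0, 5.0, lysate_ml=0.01)
fold = fiopc(sample_va, control_va)
sa = specific_activity(sample_va, slope_mA_per_umol=1500.0,
                       protein_mg_per_ml=2.0)
```

Keeping the dilution factor and lysate volume as explicit arguments mirrors the statement that all listed activities account for dilutions and volumes of lysate added.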

5.2 Example 1: Activity-Based Discovery Screen for Xylose Isomerases

Libraries used for the activity-based discovery (“ABD”) screen were in the format of excised phagemids. These libraries were constructed as described in U.S. Pat. No. 6,280,926. Sources for these libraries were environmental rumen samples collected from the foregut of deceased herbivores.

An Escherichia coli screening strain was constructed to identify genes from the environmental libraries encoding xylose isomerase activity. Specifically, E. coli strain SEL700, an MG1655 derivative that is recA⁻, phage lambda resistant and contains an F′ plasmid, was complemented with plasmid pJC859, a derivative of pBR322 containing the E. coli recA gene (Kokjohn et al., 1987, J. Bacteriol. 169:1499-1508) to generate a wild-type recA phenotype.

A two-step marker exchange procedure was then used to delete the entire coding sequence of the endogenous xylA xylose isomerase gene. Briefly, pMEV3, a plasmid with a pir-dependent replicon (oriR6K) encoding kanamycin-resistance and the sacB levansucrase, was used as a vector for construction of the xylA deletion plasmid. A fragment of DNA containing the flanking regions of the xylA gene (0.7 kb of sequence 5′ and 0.9 kb of sequence 3′ of xylA) and containing BsaI restriction sites was generated by overlap extension PCR using primers, ligated to pMEV3 digested with BbsI, and transformed into E. coli by electroporation. Clones were confirmed by sequencing, resulting in plasmid pMEV3-ΔxylA (FIG. 1A).

The pMEV3-ΔxylA plasmid was then transformed into E. coli strain SEL700 (MG1655 λ^(r), Δ(recA-srl)306, srl-301::Tn10-84(Tet^(s)), [F′ proAB, lacI^(q), ZΔM15, Tn10 (Tet^(r))] pJC859). Single-crossover events were selected for by plating on LB agar plates containing kanamycin (final concentration, about 50 μg/ml). After confirmation of integration of pMEV3-ΔxylA on the chromosome, a second crossover event was selected for by growth on LB agar media containing sucrose (FIG. 2). Colonies displaying resistance to kanamycin and the ability to grow on sucrose were screened both by PCR characterization with primers flanking the xylA gene to confirm gene deletion and by growth on a modified MacConkey media (ABD media), comprised of: MacConkey Agar Base (Difco™ #281810) (approximate formula per liter: Pancreatic Digest of Gelatin (17.0 g), Peptones (meat and casein) (3.0 g), Bile Salts No. 3 (1.5 g), Sodium Chloride (5.0 g), Agar (13.5 g), Neutral Red (0.03 g), Crystal Violet (1.0 mg), Xylose (30.0 g) and Kanamycin (50 mg)). The ABD media contained neutral red, a pH indicator that turns red at a pH <6.8. Colonies of mutants lacking xylA appeared white on this media while colonies with restored xylose metabolism ability appeared red in color due to the fermentation of xylose to xylulose, which lowered the pH of the media surrounding those colonies.

Following the successful deletion of xylA, the resulting strain was cured of pJC859 by the following method: The xylA deletion strain was grown for about 24 hours in LB media containing tetracycline (final concentration, about 20 μg/ml) at around 37° C. The next day the cells were subcultured (1:100 dilution) into LB tetracycline media (at the same concentration) and incubated at three different temperatures (about 30, 37, and 42° C.). Cells were passaged the same way as above for about two more days. Dilutions of the resulting cultures were plated on LB plates to isolate single colonies. Colonies were replica plated onto LB agar plates with and without carbenicillin (at about 100 μg/ml, final concentration). Carbenicillin resistant colonies were deemed to still contain vector pJC859 whereas carbenicillin sensitive colonies were cured of pJC859, restoring the recA genotype of strain SEL700. This strain, SEL700 ΔxylA, was used for the ABD screening.

The ABD screening method was verified by creating a positive control strain by PCR amplification of the xylA gene from E. coli K12 and cloning into the PCR-BluntII TOPO vector (Invitrogen, Carlsbad, Calif.) using standard procedures. This vector (PCR-BluntII-TOPO-xylA, FIG. 1B) was then transformed into the screening strain (SEL700 ΔxylA). Complementation of the xylose phenotype was verified by growth of transformants on ABD media and appearance of red halos indicating xylose utilization.

The libraries were screened for XI activity by infecting strain SEL700 ΔxylA with the excised phagemid libraries. Infected cells were plated onto ABD media and only colonies with red “halos” (indicating xylose fermentation) were carried forward. Positives were purified to single colonies and regrown on ABD media to confirm phenotype.

5.3 Example 2: Sequence-Based Discovery for Xylose Isomerases

Libraries used for sequence-based discovery (“SBD”) were in the format of genomic DNA (gDNA) extractions. These libraries were constructed as described in U.S. Pat. No. 6,280,926. Sources for these libraries were samples collected from the guts of deceased herbivores.

XI genes often exist in conserved gene clusters (Dodd et al., 2011, Molecular Microbiol. 79:292-304). In order to obtain full length XI gene sequences from metagenomic samples, primers were designed to both upstream and downstream conserved DNA sequences found in several Bacteroides species, typically xylulose kinase and xylose permease, respectively. These flanking DNA sequences were obtained from public databases. Sample genomic DNA was extracted from eleven different animal rumen samples. The left flanking consensus primer has the sequence 5′-GCIGCICARGARGGNATYGTVTT-3′ (SEQ ID NO:177) (this primer codes for the amino acid motif AAQEGIV(F) (SEQ ID NO:178)). The right flanking consensus primer has the sequence 5′-GCDATYTCNGCRATRTACATSGG-3′ (SEQ ID NO:179) (this primer codes for the amino acid motif PMYIAEIA (SEQ ID NO:180)). PCR reactions were carried out using touchdown cycling conditions and hot start Platinum® Taq DNA polymerase (Invitrogen, Carlsbad, Calif.). PCR products of expected size were purified and subcloned into the pCR4-TOPO vector system (Invitrogen, Carlsbad, Calif.). Positive colonies from the TOPO-based PCR libraries were transformed into TOP10 (Invitrogen, Carlsbad, Calif.) and the transformants grown on LB agar plates with kanamycin (about 25 μg/ml final concentration). Resistant colonies were picked and inoculated into 2 columns each of a 96-deep well plate in about 1.2 ml LB kanamycin (25 μg/ml final concentration) media per well. Cultures were grown overnight at about 30° C. The next day plasmids were purified and inserts sequenced. Sequence analysis revealed multiple full length XI genes. Identification of putative ORFs was done by identifying start and stop codons for the longest protein coding region, and subsequent manual curation based on homology to published xylose isomerase DNA sequences.
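The two consensus primers above are written with IUPAC ambiguity codes, so each represents a pool of oligonucleotides. The sketch below, offered as an illustration only, counts the size of that pool and forms the reverse complement of a primer so that the right flanking primer can be compared with its motif. Two assumptions are made that are not stated in the source: inosine (I) is counted as a single species because it is one physical base, and its complement is written as N.

```python
# Standard IUPAC nucleotide ambiguity codes; "I" (inosine) is an assumption here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "I",  # one synthetic base that pairs broadly, so it adds no degeneracy
}

COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
    "I": "N",  # assumption: inosine can face any base
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def reverse_complement(primer):
    """Reverse complement that preserves IUPAC ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer.upper()))


LEFT_PRIMER = "GCIGCICARGARGGNATYGTVTT"   # SEQ ID NO:177
RIGHT_PRIMER = "GCDATYTCNGCRATRTACATSGG"  # SEQ ID NO:179

left_pool = degeneracy(LEFT_PRIMER)
right_pool = degeneracy(RIGHT_PRIMER)
right_coding_strand = reverse_complement(RIGHT_PRIMER)
```

Read in codons, the reverse complement of the right primer (CCS ATG TAY ATY GCN GAR ATH GC) corresponds to the PMYIAEIA motif, which is consistent with that primer annealing to the coding strand downstream of the XI gene.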

5.4 Example 3: XI Sequence Analysis

Plasmids from both ABD and SBD screens were purified and vector inserts were sequenced using an ABI 3730xl DNA Analyzer and ABI BigDye® v3.1 cycle sequencing chemistry. Identification of putative ORFs was done by identifying start and stop codons for the longest protein coding region, and subsequent manual curation based on homology to published xylose isomerase DNA sequences. The XI ORFs identified are set forth in Table 2 below, which indicates the sequences and source organism classification for each XI determined from either the ABD or SBD libraries as well as their assigned sequence identifiers. The putative catalytic domains (based on sequence alignments with other XIs) are underlined.
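The first step of the ORF identification described above, locating the longest start-to-stop coding region, can be sketched as follows. This is a simplified illustration under stated assumptions: only ATG is treated as a start codon, only the three forward reading frames are scanned, and the manual, homology-based curation step is not represented. The function name and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, start_codon="ATG"):
    """Return the longest ATG-to-stop ORF found in the three forward frames.

    Returns an empty string if no complete ORF is present.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None:
                if codon == start_codon:
                    orf_start = i  # open an ORF at the first in-frame start
            elif codon in STOP_CODONS:
                candidate = seq[orf_start:i + 3]  # include the stop codon
                if len(candidate) > len(best):
                    best = candidate
                orf_start = None  # look for the next ORF in this frame
    return best


# Toy sequence with two in-frame ORFs; not taken from Table 2.
toy_insert = "AAATGAAACCCGGGTAAATGTTTTAGCC"
orf = longest_orf(toy_insert)
```

To cover both strands, the same function could be applied to the reverse complement of the insert and the longer of the two results kept.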

TABLE 2 SEQ Clone Class of Type of ID No. organism Sequence NO: Sequence1754MI2_ Bacteroidales DNA   1ATGGCAGTTAAAGAATATTTCCCGGAGATAGGCAAGATCGCCTTTGAAGGAAAGGAGTCC 001AAGAACCCTATGGCATTCCACTACTACAATCCAGAGCAGGTAGTAGCCGGAAAGAAAATGAAAGATTGGTTCAAGTTCGCTATGGCATGGTGGCACACCCTCTGCGCTGAAGGTGGCGACCAGTTCGGTCCTGGTACCAAGAAATTCCCTTGGAACACAGGTGCAACTGCACTCGAAAGAGCAAAGAACAAAATGGACGCAGGTTTCGAGATCATGAGCAAGCTCGGTATCGAGTATTTCTGCTTCCACGATGTTGACCTTATCGACGAGGCTGACACTGTTGAAGAGTACGAGGCTAACATGAAGGCTATCACAGCTTACGCAAAGGAGAAAATGGCCGCTACTGGCATCAAACTCCTCTGGGGAACAGCCAATGTATTCGGCAACAAGAGATATATGAACGGCGCTTCTACCAACCCTGACTTCAACGTGGCTGCACGCGCTATGCTCCAGATCAAGAACGCTATCGACGCAACTATCGCTCTCGGTGGTGACTGCTATGTATTCTGGGGCGGCCGTGAGGGTTACATGAGCCTTCTCAACACCGATATGAAGAGAGAGAAAGAGCACATGGCTACCATGCTTACCATGGCACGCGACTATGCTCGTTCTAAGGGCTTCAAGGGTACCTTCCTTATCGAGCCTAAGCCAATGGAGCCGATGAAGCACCAGTACGATGTCGATACTGAGACTGTCGTAGGTTTCCTCCGCGCCCATGGTCTTGACAAGGACTTCAAGGTAAACATCGAGGTTAACCACGCTACTCTCGCAGGCCACACCTTCGAGCACGAGCTCCAGTGCGCCGTTGACGCAGGCATGCTCGGAAGCATCGACGCCAACCGTGGTGACTACCAGAACGGCTGGGATACCGACCAGTTCCCTATCGACCTCTATGAGCTCGTACAGGCTATGATGGTTATCATCAAGGGCGGCGGTCTCGTCGGCGGTACCAACTTCGACGCCAAGACCCGTCGTAACTCAACAGACCTCGAGGATATCTTCATCGCTCATGTATCCGGCATGGATGTCATGGCACGCGCTCTCCTCATCGCTGCTGACCTTCTCGAGAAATCTCCTATTCCTGCAATGGTCAAGGAGCGTTACGCTTCCTACGACTCAGGCATGGGCAAGGACTTCGAGAACGGCAAGCTTACTCTCGAGCAGGTTGTCGATTTCGCAAGAAAGAACGGCGAGCCTAAGAGCACCAGCGGAAAGCAGGAGCTCTACGAGTCTATCGTCAATCTCTACATCTAA 1754MI2_Bacteroidales Amino    2MAVKEYFPEIGKIAFEGKESKNPMAFHYYNPEQVVAGKKMKDWFKFAMAWWHTLCAEGGD 001 AcidQFGPGTKKFPWNTGATALERAKNKMDAGFEIMSKLGIEYFCFHDVDLIDEADTVEEYEANMKAITAYAKEKMAATGIKLLWGTANVFGNKRYMNGASTNDDFNVAARAMLQIKNAIDATIALGGPCYVFWGGREGYMSLLNTDMKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPMKHQYDVDTETVVGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELQCAVDAGMLGSIDANRGDYQNGWDTDQFPIDLYELVQAMMVIIKGGGLVGGTNFDAKTRRNSTDLEDIFIAHVSGMDVMARALLIAADLLEKSPIPAMVKERYASYDSGMGKDFENGKLTLEQVVDFARKNGEPKSTSGKQELYESIVNLYI 5586MI6_ Bacteroidales DNA   3ATGGCAAACAAAGAGTACTTCCCGGAGATCGGGAAAATCAAATTCGAAGGCAAGGATTCC 
004AAGAACCCGCTTGCATTCCATTATTACAATCCTGAGCAGGTCGTCTGCGGCAAGCCGATGAAGGACTGGCTCAAGTTCGCTATGGCATGGTGGCACACCCTCTGCGCAGAGGGTAGCGACCAGTTCGGCGGACCCACCAAGTCATTCCCTTGGAACAAAGCTTCGGATCCCATCGCAAAGGCCAAGCAGAAAGTCGACGCCGGTTTCGAGATCATGCAGAAGCTCGGTATCGGATACTATTGCTTCCACGATGTAGACCTCATCGACGAGCCCGCCACCATCGAGGAGTATGAGGCCGATCTCAAGGAGATCGTCGCTTACCTCAAGGAGAAGCAGGCCCAGACCGGCATCAAGCTCCTTTGGGGCACCGCCAACGTCTTCGGTCACAAGCGGTACATGAACGGCGCCTCCACCAACCCTGATTTCGACGTCGCAGCCCGCGCCATGGTCCAGATCAAGAACGCCATGGACGCCACCATCGAGCTCGGCGGCGAGTGCTATGTCTTCTGGGGCGGCCGCGAGGGCTACATGAGCCTCCTCAACACCGACATGAAGCGTGAGAAGCAGCATATGGCCACCATGCTCGGCATGGCCCGCGACTATGCACGCGGCAAGGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCGACCAAGCACCAGTATGACGTCGACACCGAGACCGTCATCGGTTTCCTCCGTGCCAACGGTCTTGACAAGGACTTCAAGGTCAACATCGAGGTCAATCACGCCACCCTCGCCGGCCACACCTTCGAGCATGAGCTCCAGTGCGCCGCCGATGCCGGTCTCCTCGGATCCATCGACGCCAACCGCGGCGACTATCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACCTCTATGAGCTCACCCAGGCCATGATGGTCATCCTCAAGAATGGCGGCCTCGTCGGCGGTACCAACTTCGACGCCAAGACCCGTCGCAACTCCACCGACCTGGACGACATCATCATCGCCCACGTCAGCGGTATGGACATCATGGCACGCGCACTCCTCGTCGCTGCCGACGTCCTCACCAAGTCCGAGCTTCCCAAGATGCTCAAGGAGCGTTACGCTTCCTTCGACTCCGGCAAGGGCAAGGAGTTCGAAGAGGGCAAGCTCACTCTCGAGCAGGTCGTAGAGTACGCCAAGACCAAGGGCGAGCCCAAGGCCACCAGCGGCAAGCAGGAGCTCTACGAGACCATCGTCAACATGTACATCTAA 5586MI6_Bacteroidales Amino    4MANKEYFPEIGKIKFEGKDSKNPLAFHYYNPEQVVCGKPMKDWLKFAMAWWHTLCAEGSD 004 AcidQFGGPTKSFPWNKASDPIAKAKQKVDAGFEIMQKLGIGYYCFHDVDLIDEPATIEEYEADLKEIVAYLKEKQAQTGIKLLWGTANVFGHKRYMNGASTNPDFDVAARAMVQIKNAMDATIELGGECYVFWGGREGYMSLLNTDMKREKQHMATMLGMARDYARGKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELQCAADAGLLGSIDANRGDYQNGWDTDQFPIDLYELTQAMMVILKNGGLVGGTNFDAKTRRNSTDLDDIIIAHVSGMDIMARALLVAADVLTKSELPKMLKERYASFDSGKGKEFEEGKLTLEQVVEYAKTKGEPKATSGKQELYETIVNMYI 5749MI1_ Bacteroidales DNA   5ATGAATTTTTATAAAGGCGAAAAAGAATTCTTCCCCGGAATAGGAAAGATTCAGTTTGAA 
003GGACGCGAGTCAAAGAACCCGATGGCGTTTCATTATTATGACGAAAACAAGGTGGTGATGGGTAAAACACTGAAGGATCATCTTCGTTTTGCAATGGCTTACTGGCATACGCTTTGTGCCGAAGGGGGCGACCAGTTTGGCGGTGGTACGAAAACATTCCCCTGGAATGCTGCTGCCGACCCGATCAGCCGTGCCAAATATAAGATGGATGCAGCGTTCGAGTTTATGACAAAATGCAGCATCCCTTATTACTGTTTCCATGATGTGGACGTGGTGGACGAAGCTCCCACGCTGGCTCAGTTTGAAAAAGACCTTCATACGATGGTAGGCCATGCCAAAGGGCTTCAGCAGGCAACCGGAAAAAAACTGTTATGGTCTACTGCCAACGTGTTCAGCAACAAACGCTATATGAACGGGGCTGCCACTAATCCTGACTTCTCGGCCGTGGCTTGTGCCGGTACGCAGATCAAGAATGCGATCGATGCCTGTATCGCGCTGGACGGTGAAAACTATGTGTTCTGGGGCGGACGTGAAGGATATATGGGCTTGCTCAATACCGATATGAAACGCGAAAAAGACCATCTGGCCATGATGCTGACGATGGCACGCGACTATGGCCGCAAGAACGGTTTCAAAGGTACTTTCCTGATCGAGCCGAAACCGATGGAACCGACCAAGCATCAATATGATGTCGACTCGGAAACTGTAATCGGCTTCCTACGTCATTATGGCCTGGATAAAGACTTCGCCCTGAATATCGAAGTAAATCATGCAACCCTGGCCGGACATACGTTCGAGCACGAATTGCAGGCTGCTGTCGATGCCGGTATGCTGTGCAGTATCGATGCCAACCGTGGTGACTACCAGAATGGCTGGGATACCGACCAATTCCCGATGGACATCTACGAACTGACTCAGGCTTGGCTGGTCATTCTGCAAGGTGGTGGTCTGACAACCGGCGGAACGAACTTCGATGCCAAGACCCGCCGCAACTCGACCGACCTGGACGATATCTTCCTGGCTCATATAGGTGGTATGGATGCGTTTGCCCGTGCCCTGATCACGGCTGCTGCCATCCTTGAAAACTCCGATTACACGAAGATGCGTGCCGAACGTTACACCAGCTTCGATGGTGGCGAAGGCAAAGCGTTTGAAGACGGTAAACTTTCTCTGGAAGACCTGCGTACGATCGCTCTCCGCGACGGAGAACCGAAGATGGTCAGCGGCAAACAGGAATTATATGAGATGATTCTCAATTTA TACATATAA5749MI1_ Bacteroidales Amino    6MNFYKGEKEFFPGIGKIQFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA 003 AcidEGGDQFGGGTKTFPWNAAADPISRAKYKMDAAFEFMTKCSIPYYCFHDVDVVDEAPTLAQFEKDLHTMVGHAKGLQQATGKKLLWSTANVFSNKRYMNGAATNPDFSAVACAGTQIKNAIDACIALDGENYVFWGGREGYMGLLNTDMKREKDHLAMMLTMARDYGRKNGFKGTFLIEPKPMEPTKHQYDVDSETVIGFLRHYGLDKPFALNIEVNHATLAGHTFEHELQAAVDAGMLCSIDANRGDYQNGWDTDQFPMDIYELTQAWLVILQGGGLTTGGTNFDAKTRRNSTDLDDIFLAHIGGMDAFARALITAAAILENSDYTKMRAERYTSFDGGEGKAFEDGKLSLEDLRTIALRDGEPKMVSGKQELYEMILNLYI 5750MI1_ Bacteroidales DNA   7ATGAATTACTTTAAAGGTGAGAAAGAGTTCTTCCCGGGAATCGGGAAAATAGAGTTTGAA 
003GGACGTGAATCGAAGAATCCGATGGCTTTTCATTACTATGACGAGAACAAGGTTGTCATGGGGAAGACCTTGAAGGACCATCTGCGTTTTGCGATGGCTTATTGGCATACGCTGTGTGCGGAAGGCGCCGACCAGTTCGGCGGCGGGACGAAGGCATTTCCCTGGAATACCGGGGCGGATCGTATTTCCCGTGCCAAGTATAAGATGGATGCTGCTTTTGAGTTTATGACGAAATGTAACATCCCGTACTATTGTTTCCATGATGTGGATGTGGTGGATGAAGCTCCGACACTGGCCGAATTTGAAAAAGACTTGCATACGATGGTCGAATATGCCAAGCAGCATCAGGAGGCAACCGGGAAAAAACTGTTGTGGTCTACCGCCAATGTGTTCAGCAATAAACGTTATATGAACGGGGCTGCCACAAATCCGTATTTCCCTGCTGTCGCTTGTGCGGGTACGCAGATCAAGAATGCTATCGACGCTTGTATTGCCCTGGGCGGCGAAAACTATGTGTTCTGGGGCGGTCGTGAAGGGTATATGAGCTTGTTGAACACCAATATGAAACGCGAAAAGGAACATCTCGCCATGATGTTGACGATGGCTCGCGATTATGCGCGTAAGAACGGCTTCAAAGGTACTTTCCTGGTAGAGCCTAAACCGATGGAACCGACCAAACATCAGTATGATGTGGACACAGAAACTGTTATCGGCTTCCTGCGTCATTACGGCCTTGACAAGGACTTTGCCATCAACATCGAAGTGAATCATGCTACATTGGCTGGACATACATTCGAACATGAGCTTCAGGCGGCTGCCGATGCCGGTATGCTGTGCAGCATCGACGCCAACCGCGGCGATTACCAGAATGGTTGGGACACGGATCAGTTCCCGGTCGACATCTACGAACTGACACAGGCGTGGCTGGTTATCCTCGAAGCGGGTGGCCTGACTACCGGTGGTACGAACTTCGACGCCAAGACGCGCCGCAACTCGACTGACCTGGACGATATCTTCCTGGCACACATCGGTGGTATGGATTCGTTTGCCCGTGCTTTGATGGCGGCTGCCGATATATTGGAACACTCCGATTACAAAAAGATGCGTGCCGAACGTTATGCCAGCTTCGATCAAGGCGACGGCAAGAAGTTCGAAGATGGTAAACTCCTTCTCGAGGACCTCCGCACCATCGCTCTTGCCTCCGGCGAACCGAAGCAAATCAGCGGGAAACAGGAATTGTATGAAATGATTATCAACCAG TACATTTAA5750MI1_ Bacteroidales Amino      8MNYFKGEKEFFPGIGKIEFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA 003 AcidEGADQFGGGTKAFPWNTGADRISRAKYKMDAAFEFMTKCNIPYYCFHDVDVVDEAPTLAEFEKDLHTMVEYAKQHQEATGKKLLWSTANVFSNKRYMNGAATNPYFPAVACAGTQIKNAIDACIALGGENYVFWGGREGYMSLLNTNMKREKEHLAMMLTMARDYARKNGFKGTFLVEPKPMEPTKHQYDVDTETVIGFLRHYGLDKPFAINIEVNHATLAGHTFEHELQAAADAGMLCSIDANRGDYQNGWDTDQFPVDIYELTQAWLVILEAGGLTTGGTNFDAKTRRNSTDLDDIFLAHIGGMDSFARALMAAADILEHSDYKKMRAERYASFDQGDGKKFEDGKLLLEDLRTIALASGEPKQISGKQELYEMIINQYI 5750MI2_ Bacteroidales DNA   9ATGAATTATTTTAAAGGTGAAAAAGAGTTTTTCCCTGGAATCGGGAAAATAGAGTTTGAA 
003GGACGTGAGTCGAAGAATCCGATGGCTTTTCATTATTATGATGAAAACAAGGTCGTAATGGGCAAGACCTTGAAAGATCACCTCCGCTTTGCAATGGCTTACTGGCATACGTTGTGCGCGGAAGGCGCAGACCAGTTTGGCGGTGGCACAAAATCATTCCCCTGGAATACCGCAGCGGATCGTATTTCCCGCGCTAAATATAAAATGGATGCTGCTTTCGAGTTTATGACCAAGTGCAGTATCCCGTACTATTGTTTCCATGATGTGGACGTGGTGGACGAAGCTCCGGCACTGGCCGAATTTGAAAAGGACCTGCATACGATGGTGGGATTCGCCAAACAACACCAGGAAGCAACCGGAAAGAAACTGTTGTGGTCTACAGCCAATGTATTCGGGCATAAACGTTATATGAACGGAGCGGCTACCAATCarTATTTCCCGGCTGTCGCTTGTGCCGGTACGCAGATCAAGAATGCAATCGACGCCTGTATCGAGCTGGGTGGAGAGAACTATGTATTCTGGGGCGGACGCGAAGGCTACATGAGCCTGCTGAACACCAATATGAAACGTGAAAAGGATCATTTGGCCATGATGCTGACAATGGCACGCGATTATGCCCGCAAGAATGGTTTCAAGGGTACTTTCCTGGTGGAATCTAAGCCGATGGAACCGACCAAACATCAGTATGACGCAGATACGGAAACCGTGATCGGCTTCCTGCGCCACTATGGCCTCGACAAGGATTTCGCTATCAACATTGAAGTGAACCATGCTACATTGGCCGGCCATACATTCGAACATGAACTTCAGGCTGCTGCCGATGCCGGTATGCTGTGCAGCATCGATGCAAATAGAGGCGACTATCAGAATGGTTGGGATACGGATCAGTTCCCCGTAGACATTTACGAACTGACACAGGCCTGGCTGGTTATCCTGGAAGCGGGCGGACTGACAACCGGAGGTACGAACTTCGATGCGAAGACCCGTCGTAACTCGACTGACCTCGACGATATCTTCCTGGCCCATATCGGCGGTATGGATTCGTTTGCACGTGCCTTGATGGCAGCTGCCGATATCCTGGAACATTCTGATTACAAGAAGATGCGTGCCGAACGTTACGCCAGCTTCGACCAGGGCGACGGCAAGAAGTTCGAAGACGGCAAACTCCTTCTCGAAGACCTGCGCACAATTGCCCTTGCCGGCGACGAACCGAAGCAGATCAGCGGCAAGCAGGAGTTGTATGAGATGATTATCAATCAG TATATTTAA5750MI2_ Bacteroidales Amino   10MNYFKGEKEFFPGIGKIEFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA 003 AcidEGADQFGGGTKSFPWNTAADRISRAKYKMDAAFEFMTKCSIPYYCFHDVDVVDEAPALAEFEKDLHTMVGFAKQHQEATGKKLLWSTANVFGHKRYMNGAATNPYFPAVACAGTQIKNAIDACIELGGENYVFWGGREGYMSLLNTNMKREKDHLAMMLTMARDYARKNGFKGTFLVESKPMEPTKHQYDADTETVIGFLRHYGLDKDFAINIEVNHATLAGHTFEHELQAAADAGMLCSIDANRGDYQNGWDTDQFPVDIYELTQAWLVILEAGGLTTGGTNFDAKTRRNSTDLDDIFLAHIGGMDSFARALMAAADILEHSDYKKMRAERYASFDQGDGKKFEDGKLLLEDLRTIALAGDEPKQISGKQELYEMIINQYI 5586MI5_ Bacteroides DNA  11ATGAAACAGTATTTCCCGAACATCTCCGCCATCAAGTTTGAGGGCGTCGAGAGCAAGAAT 
004CCCCTGGCTTACCGCTACTACGACCGCGACCGCGTCGTCATGGGTAAGAAGATGAGCGAATGGTTTAAGTTCGCTATGTGCTGGTGGCACACCCTCTGCGCCGAGGGCTCCGATCAGTTCGGTCCCGGCACAAAGACCTTCCCCTGGAACGCCGCCGCCGACCCCGTGCAGGCTGCCAAGGACAAGGCCGACGCTGGCTTCGAGATCATGCAGAAACTCGGCATCGAGTACTACTGCTTCCACGACGTTGACCTCGTGGCCGAGGCTCCCGACGTGGAGACCTACGAGAAGAACCTCAAGGAGATCGTGGCTTATCTCAAGCAGAAACAGGCTGAGACGGGCATCAAGCTGCTCTGGGGCACTGCCAACGTCTTCGGACACAAGCGCTACATGAACGGAGCCTCCACGAACCCCGACTTCGATGTCGTGGCACGCGCTATCGTGCAGATCAAGAACGCCATCGATGCTACCATCGAGCTGGGCGGCACCAACTACGTCTTCTGGGGCGGTCGCGAAGGCTACATGAGCCTGCTCAACACCGATATGAAGCGCGAGAAGGAGCACATGGCTACGATGTTGACGATGGCACGCGACTATGCCCGTTCTAAGGGATTCAAGGGCACGTTCCTCATCGAACCCAAACCCATGGAACCCACGAAGCATCAGTACGATGCGGACACCGAGACGGTCATCGGATTCCTCCGTGCTCATGGTCTCGACAAGGATTTCAAGGTCAACATCGAGGTCAACCACGCCACGCTGGCCGGACACACGTTCGAGCATGAGCTGGCCTGCGCCGTAGACGCCGATATGCTCGGCAGCATCGATGCCAATCGCGGCGACTATCAGAACGGATGGGACACCGACCAGTTCCCCATCGACCACTACGAACTCACGCAGGCTATGCTGCAGATCATCCGCAACGGAGGTTTCAAGGACGGTGGCACCAATTTTGACGCTAAGACGCGCCGCAACAGCACCGACCTCGAGGATATCTTCATCGCTCACGTAGCAGCCATGGACGCCATGGCCCACGCCCTGTTGTCGGCTGCCGATATCATCGAGAAGTCGCCCATCTGCACGATGGTCAAGGAGCGTTACGCCAGCTTCGATGCCGGCGAAGGCAAGCGCTTCGAAGAAGGCAAGATGACCCTCGAGGAAGCCTACGAGTATGGCAAGAAGGTCGGGGAGCCCAAGCAGACCAGCGGAAAGCAGGAGCTCTACGAAGCCATTGTCAATATGTATTGA 5586MI5_ BacteroidesAmino   12 MKQYFPNISAIKFEGVESKNPLAYRYYDRDRVVMGKKMSEWFKFAMCWWHTLCAEGSDQF004 Acid GPGTKTFPWNAAADPVQAAKDKADAGFEIMQKLGIEYYCFHDVDLVAEAPDVETYEKNLKEIVAYLKQKQAETGIKLLWGTANVFGHKRYMNGASTNPDFDVVARAIVQIKNAIDATIELGGTNYVFWGGREGYMSLLNTDMKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPTKHQYDADTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDADMLGSIDANRGDYQNGWDTDQFPIDHYELTQAMLQIIRNGGFKDGGTNFDAKTRRNSTDLEDIFIAHVAAMDAMAHALLSAADIIEKSPICTMVKERYASFDAGEGKRFEEGKMTLEEAYEYGKKVGEPKQTSGKQELYEAIVNMY 5586MI202_ Bacteroides DNA  13ATGGCAACAAAAGAGTATTTTCCCGGAATAGGAAAGATTAAATTCGAAGGTAAAGAGAGT 
004ATGAACCCGATGGCATATCGTTACTACGATGCTGAGAAGGTAATCATGGGTAAGAAGATGAAAGATTGGTTGAAGTTTGCTATGGCTTGGTGGCACACTCTCTGCGCAGAAGGTGGTGACCAATTCGGTGGCGGAACGAAACAATTCCCTTGGAATGGTGACTCTGACGCTTTGCAAGCAGCTAAAAATAAATTGGATGCAGGTTTCGAATTCATGCAGAAGATGGGTATCGAATACTATTGCTTCCACGATGTAGACCTGATTTCTGAAGGTGCAAGCATCGAAGAATACGAAGCTAACTTGAAAGCTATCGTAGCTTATGCAAAAGAAAAACAGGCTGAAACTGGTATCAAGCTGTTGTGGGGTACTGCTAACGTATTCGGTCATGCACGTTATATGAACGGTGCTGCTACCAATCCTGATTTCGACGTTGTAGCACGCGCTGCTGTTCAGATCAAGAACGCTATTGACGCTACTATCGAACTGGGTGGTTCAAACTATGTATTCTGGGGCGGTCGCGAAGGTTACATGTCTTTGCTGAACACTGACCAGAAACGTGAAAAAGAACACCTTGCAAAGATGTTGACTATCGCTCGTGACTATGCACGTGCTCGTGGCTTCAAAGGTACTTTCCTGATTGAGCCGAAACCGATGGAACCGACAAAACATCAGTATGATGTAGATACTGAAACAGTTATCGGCTTCCTGAAAGCTCACGGTTTGGATAAGGATTTCAAAGTAAACATCGAGGTTAATCACGCAACTTTGGCTGGCCATACTTTCGAACACGAACTGGCTGTAGCTGTTGACAACGGCATGTTAGGTTCTATCGACGCTAACCGTGGTGACTACCAGAACGGTTGGGATACTGACCAATTCCCTATCGATAACTACGAACTGACTCAAGCTATGATGCAGATCATCCGCAACGGTGGTTTGGGTAATGGCGGTACTAACTTCGACGCTAAGACCCGTCGTAACTCTACCGACCTGGAAGATATCTTCATCGCTCACATTGCAGGTATGGATGCTATGGCACGTGCTCTGGAAAGTGCAGCTAAATTACTGGAAGAATCTCCTTATAAGAAAATGTTGGCTGATCGTTACGCATCATTCGACGGTGGCAAGGGTAAGGAATTCGAAGAAGGCAAATTGTCTTTGGAAGATGTTGTAGCTTATGCGAAAGCTAACGGCGAACCGAAGCAAACCAGCGGCAAGCAAGAATTGTATGAAGCAATCGTGAATATGTATTGCTAA 5586MI202_Bacteroides Amino   14MATKEYFPGIGKIKFEGKESMNPMAYRYYDAEKVIMGKKMKDWLKFAMAWWHTLCAEGGD 004 AcidQFGGGTKQFPWNGDSDALQAAKNKLDAGFEFMQKMGIETYCFHDVDLISEGASIEEYEANLKAIVAYAKEKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKNAIDATIELGGSNYVFWGGREGYMSLLNTDQKREKEHLAKMLTIARDYARARGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIAGMDAMARALESAAKLLEESPYKKMLADRYASFDGGKGKEFEEGKLSLEDVVAYAKANGEPKQTSGKQELYEAIVNMYC 5586MI211_ Bacteroides DNA  15ATGGCAAAAGAGTATTTTCCTGGCGTGAAAAAAATCCAGTTCGAGGGTAAGGACAGTAAG 
003AATCCAATGGCTTACCGTTATTATGATGCAGAGAAGGTCATCATGGGTAAGAAGATGAAGGATTGGTTGAAGTTCGCTATGGCTTGGTGGCACACTTTGTGCGCTGAGGGCGCAGACCAGTTCGGTGGCGGTACTAAGACTTTCCCTTGGAACGAAGGTGCAAACGCTTTGGAAGTTGCTAAGAATAAGGCTGATGCTGGTTTCGAGATTATGGAGAAGCTTGGCATCGAGTACTACTGTTTCCACGATGTAGACCTCGTTGAGGAGGCTGCAACTATCGAGGAGTATGAGGCTAACATGAAGGCTATCGTTGCTTATCTTAAGGAGAAGCAGGCTGCTACTGGCAAGAAGCTTCTTTGGGGTACTGCTAACGTATTCGGCAACAAGCGCTATATGAACGGTGCTTCTACAAACCCTGACTTCGACGTTGTTGCTCGCGCTTGTGTTCAGATTAAGAACGCTATCGACGCTACTATCGAACTTGGTGGTACAAACTACGTATTCTGGGGTGGCCGCGAGGGTTATATGAGCCTTCTTAACACAGATATGAAGCGTGAGAAGGAGCACATGGCAACTATGCTTACTAAGGCTCGCGACTACGCTCGTTCAAAGGGCTTTACTGGTACATTCCTTATCGAGCCAAAGCCAATGGAACCATCAAAGCATCAGTATGATGTTGATACTGAGACTGTTTGTGGTTTCTTGAGGGCTCACGGTCTTGACAAGGACTTCAAGGTAAACATCGAGGTTAACCACGCTACTTTGGCTGGTCACACATTCGAGCACGAGTTGGCTGCTGCTGTTGATAACGGTATGCTTGGCTCTATCGACGCTAACCGCGGTGACTACCAGAACGGTTGGGATACTGACCAGTTCCCTATCGACAACTTCGAGCTTATTCAGGCTATGATGCAGATTATCCGCAACGGTGGTCTTGGCAACGGTGGTACAAACTTCGACGCTAAGACTCGTCGTAACTCAACTGACCTTGAGGATATCTTCATCGCACACATCGCTGGTATGGATGCAATGGCTCGCGCTCTTGAGAACGCAGCAGACCTTTTGGAGAACTCTCCAATCAAGAAGATGGTTGCTGAGCGTTACGCTTCATTCGACAGCGGCAAGGGTAAGGAGTTCGAGGAAGGCAAGTTGAGCCTTGGGGACATCGTTGCTTATGCTAAGCAGAACGGTGAGCCTAAGCAGACAAGCGGTAAGCAGGAGCTTTACGAGGCTATCGTAAACATGTACTGCTAA 5586MI211_Bacteroides Amino   16MAKEYFPGVKKIQFEGKDSKNPMAYRYYDAEKVIMGKKMKDWLKFAMAWWHTLCAEGADQ 003 AcidFGGGTKTFPWNEGANALEVAKNKADAGFEIMEKLGIEYYCFHDVDLVEEAATIEEYEANMKAIVAYLKEKQAATGKKLLWGTANVFGNKRYMNGASTNPDFDVVARACVQIKNAIDATIELGGTNYVFWGGREGYMSLLNTDMKREKEHMATMLTKARDYARSKGFTGTFLIEPKPMEPSKHQYDVDTETVCGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELAAAVDNGMLGSIDANRGDYQNGWDTDQFPIDNFELIQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIAGMDAMARALENAADLLENSPIKKMVAERYASFDSGKGKEFEEGKLSLGDIVAYAKQNGEPKQTSGKQELYEAIVNMYC 5606MI1_ Bacteroides DNA  17ATGGCGACAAAAGAATACTTTCCCGGAATAGGGAAAATCAAGTTTGAGGGTGTGAATAGC 
005TATAATCCGCTGGCATACAGATATTACGATGCCGAGCGCATAGTCCTTGGCAAGCCGATGAAGGAGTGGCTCAAGTTTGCCATGGCATGGTGGCACACACTCTGCGCAGAGGGTGGCGACCAGTTTGGCGGCGGTACGAAGAATTTTCCCTGGAATGGAGATCCCGATCCGGTACAGGCCGCAAAAAACAAAGTAGACGCCGGCTTCGAATTCATGACCAAGATGGGAATAGAGTATTTCTGTTTCCACGACGTGGATCTCGTCAGCGAGGCAGCAACCATCGAGGAGTATGAGGCCAACCTGAAGGAAGTGGTGGGCTACATCAAGGAAAAGCAGGCCGAGACGGGGATCAAAAACCTCTGGGGCACTGCCAACGTGTTCAGCCACGCGCGCTACATGAACGGAGCCGCCACCAACCCCGACTTCGATGTAGTGGCCCGCGCAGCCGTGCAGATCAAGAATGCTATCGACGCCACGATAGCCTTAGGTGGCACCAACTACGTGTTCTGGGGTGGCCGTGAAGGTTACATGAGCCTGCTCAACACCGACCAGAAGCGCGAGAAGGAGCATCTGGCAATGATGCTCCGCATGGCCCGCGACTATGCGCGTGCAAAAGGCTTCACCGGCACCTTCCTTATCGAGCCCAAGCCGATGGAGCCCACCAAGCACCAGTATGATGTAGACACCGAGACTGTGATAGGCTTCCTCCGTGCCCACGGCCTCGACAAGGACTTCAAGGTCAACATAGAGGTGAACCACGCCACCCTGGCCGGCCATACCTTCGAGCATGAGCTGGCAGTGGCCGTGGACAACGGTATGCTCGGCAGCATCGACGCCAACCGCGGTGACTACCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTGACCCAGGCCATGATGCAGATAATACGCAACGGCGGCTTCGGCAACGGCGGATGCAACTTCGACGCCAAGACACGCCGCAACTCCACCGACCTGGAGGATATCTTCATAGCCCACATAGCAGGCATGGACGCCATGGCCCGCGCCCTGCTCAGCGCAGCAGAAGTGCTGGAGAAATCGCCCTACAGGAAGATGCTCGCCGAGCGCTACGCACCGTTTGATGCCGGCCAGGGAAAGGCATTTGAAGAGGGCGCAATGTCGCTCACCGACCTTGTGGAGTATGCCAAGGAGCATGGCGAGCCCACACAGACTTCCGGCAAGCAGGAACTCTATGAGGCAATCGTCAATATGTATTGCTAA 5606MI1_Bacteroides Amino   18MATKEYFPGIGKIKFEGVNSYNPLAYRYYDAERIVLGKPMKEWLKFAMAWWHTLCAEGGD 005 AcidQFGGGTKNFPWNGDPDPVQAAKNKVDAGFEFMTKMGIEYFCFHDVDLVSEAATIEEYEANLKEVVGYIKEKQAETGIKNLWGTANVFSHARYMNGAATNPDFDVVARAAVQIKNATDATIALGGTNYVFWGGREGYMSLLNTDQKREKEHLAMMLRMARDYARAKGFTGTFLIEPKPMEPTKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGFGNGGCNFDAKTRRNSTDLEDIFIAHIAGMDAMARALLSAAEVLEKSPYRKMLAERYAPFDAGQGKAFEEGAMSLTDLVEYAKEHGEPTQTSGKQELYEAIVNMYC 5606MI2_ Bacteroides DNA  19ATGGCAACAAAGGAATATTTTCCCCATATAGGGAAGATCCAGTTCAAAGGCACGGAATCG 
003TACGATCCGATGTCGTATCGTTACTATGACGCCGAGCGCGTAGTTCTGGGCAAGCCCATGAAGGAATGGCTGAAATTCGCCATGGCATGGTGGCACACATTGTGCGCCGAGGGCGGCGACCAGTTCGGCGGCGGAACGAAGAAGTTCCCCTGGAACGAGGGCGAGGACGCCATGACCATCGCCAAGCAGAAGGCTGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATCGAGTATTTCTGCTTCCACGACATCGACCTGATCGGCGACCTGGGCGACGACATCGAGGACTATGAGAACCGTATGCACGAAATCACCGCACACCTGAAGGAGAAGATGGCCGCCACGGGCATCAAGAACCTGTGGGGCACTGCCAACGTGTTCGGCCACGCACGCTATATGAACGGCGCCGCCACCAACCCCGACTTCGACGTTGTGGCACGCGCATGTGTGCAGATCAAGAACGCCATCGACGCCACCATCGCTCTAGGCGGTACAAACTATGTATTCTGGGGCGGCCGCGAGGGCTACATGAGCCTGCTGAACACCGACCAGAAGCGCGAGAAAGAGCACTTGGCTACCATGCTGACCATGGCACGCGACTATGCCCGCGCCAATGGCTTCACCGGAACGTTCCTGATCGAGCCCAAACCCATGGAGCCCAGCAAGCATCAGTATGATGTGGATACCGAGACCGTAATCGGCTTCCTGAAGGCCCACAACCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCATGCCACTCTGGCCGGCCACACATTCGAGCATGAGCTGGCAGTAGCCGTGGACAACGGCATGCTGGGCAGCATCGACGCCAACCGCGGCGACTATCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTATGAGCTGACCCAGGCCATGATGCAGATAATCCGCAACGGTGGCCTCGGCAACGGCGGTACCAACTTCGACGCCAAGACACGTCGCAACTCCACCGACCTGGACGACATCTTCATCGCTCACATCGCCGGTATGGACGCTATGGCCCGCGCTCCGCTCAGCGCAGCCGACGTGCTTGAGAAGTCGCCTTACAAGAAGATGCTGGCCGACCGCTACGCTTCATTCGACAGCGGCGAGGGCAAGAAGTTCGAGGAAGGCAAGATGACTCTGGAGGATGTCGTGGCCTACGCCAAGAAGAATCCCGAACCCGCTCAGACCAGCGGCAAGCAGGAACTCTACGAGGCCATCATCAACATGTACGCCTGA 5606MI2_Bacteroides Amino   20MATKEYFPHIGKIQFKGTESTDPMSYRYTDAERVVLGKPMKEWLKFAMAWWHTLCAEGGD 003 AcidQFGGGTKKFPWNEGEDAMTIAKQKADAGFEIMQKLGIEYFCFHDIDLIGDLGDDIEDYENRMHEITAHLKEKMAATGIKNLWGTANVFGHARTMNGAATNPDFDVVARACVQIKNAIDATIALGGTNYVFWGGREGYMSLLNTDQKREKEHLATMLTMARDTARANGFTGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLDDIFIAHIAGMDAMARAPLSAADVLEKSPTKKMLADRTASFDSGEGKKFEEGKMTLEDVVAYAKKNPEPAQTSGKQELYEAIINMYA 5610MI3_ Bacteroides DNA  21ATGGCAACAAAAGAATTTTTTCCCGAGATTGGTAAAATCAAGTTTGAGGGCCGCGAAAGC 
003CGCAATCCCCTCGCATTCCGCTACTACGGCCCCGAGAAAGTCGTTCTTGGCAAGAAGATGAAAGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACACTGTGCGCCCAGGGCACCGACCAGTTTGGTGGCGACACCAAGCAGTTTCCGTGGAACACTGCCAGTGACCCCATGCAGGCCGCCAAGGATAAGGTGGATGCCGGATTTGAATTCATGACCAAGATGGGCATTGAGTACTTCTGCTTCCACGATGTGGATCTCGTCGCCGAGGCCGCCACTGTCGAGGAGTATGAGGCTAACCTCAAGACCATCGTCGCCTACATCAAAGAGAAACAAGCCGAGACCGGCATCAAGAACCTGTGGGGCACAGCCAACGTATTCGGACACAAACGCTACATGAACGGTGCCGCCACCAACCCCGACTTTGATGTCGTGGCACGCGCCATCGTGCAAATCAAGAACGCCATCGACGCCACCATCGAGTTGGGCGGCACGAGTTACGTCTTTTGGGGCGGCCGCGAGGGCCACATGAGCCTGCTCAACACCGACCAGAAGCGCGAGAAGGAGCACCTTGCACGCATGCTGACCATGGCACGCGACTATGCCCGCGCACGTGGTTTCAACGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCGACCAAGCACCAATATGATGTGGACACCGAGACCGTCATCGGTTTCCTGCGTGCCCATGGTCTGGACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCTACACTGGCCGGACACACCTTCGAGCGCGAACTGGCAGTGGCCGTCGACAACGGTCTACTCGGCTCAATCGACGCCAACCGTGGTGACTATCAGAATGGTTGGGACACCGATCAGTTCCCCATCGACCACTATGAGTTGGTTCAGGGCATGTTGCAGATTATCCGCAATGGTGGTTTCACCGACGGTGGCACCAACTTCGATGCCAAGACCCGCCGCAACTCGACCGACCTCGAGGACATCTTCATCGCCCACATCGCCGCGATGGATGCCATGGCTCATGCGCTGGAGAGTGCTGCCTCCATCATCGAGGAGTCGCCCTACTGCCAGATGGTCAAGGATCGCTATGCCTCATTTGACTCCGGCATCGGCAAGGACTTTGAGGACGGCAAGTTGACACTGGAACAAGCCTACGAGTACGGTAAGCAAGTGGGCGAACCCAAGCAGACCAGTGGCAAGCAAGAACTGTACGAGTCAATCATCAATATGTATTCCATTTAA 5610MI3_Bacteroides Amino   22MATKEFFPEIGKIKFEGRESRNPLAFRYYGPEKVVLGKKMKDWFKFAMAWWHTLCAQGTD 003 AcidQFGGDTKQFPWNTASDDMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEAATVEEYEANLKTIVAYIKEKQAETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAIVQIKNAIDATIELGGTSYVFWGGREGHMSLLNTDQKREKEHLARMLTMARDYARARGFNGTFLIEPKPMEPTKHQYEVETETVIGFLRAHGLDKEEKVNIEVNHATLAGHTFERELAVAVDNGLLGSIDANRGDYQNGWDTDQFPIDHYELVQGMLQIIRNGGFTDGGTNFDAKTRRNSTDLEDIFIAHIAAMDAMAHALESAASIIEESPYCQMVKDRYASFDSGIGKDFEDGKLTLEQAYEYGKQVGEPKQTSGKQELYESIINMYSI 5749MI2_ Bacteroides DNA  23ATGGCAACAAAAGAGTATTTTCCTGGTATAGGAAAGATTAAATTTGAAGGTAAAGAGAGT 
004AAGAATCCGATGGCATTCCGCTATTATGATGCCAATAAAGTAATCATGGGCAAGAAGATGAGCGAGTGGCTGAAGTTTGCCATGGCTTGGTGGCACACATTGTGCGCCGAAGGTGGTGACCAGTTTGGTGGTGGAACAAAGACTTTCCCGTGGAACGATTCGGACAACGCCGTAGAAGCAGCCAACCATAAAGTAGATGCCGGTTTTGAATTTATGCAGAAAATGGGCATCGAATACTATTGCTTCCATGATGTAGACCTCTGCACTGAAGCTGCTACCATTGAAGAATATGAAGCCAATCTGAAGGAAATAGTAGCCTATCCGAAACAGAAACAGGCTGAAACAGGTATCAAACTTCTGTGGGGTACGGCAAATGTATTTGGTCACAAACGCTATATGAATGGTGCTGCTACCAATCCGGATTTTGATGTAGTGGCTCGTGCTGCTGTACAGATTAAGAATGCGATAGACGCTACAATTGAACTCGGTGGTAGCAACTACGTGTTCTGGGGCGGCCGTGAAGGTTATATGAGCTTGCTCAATACAGACCAGAAACGTGAGAAAGAGCATTTGGCACAAATGTTGACCATGGCTCGTGACTATGCTCGTGCCAAAGGATTCAAGGGTACCTTCCTGGTTGAACCCAAACCGATGGAACCAACTAAACACCAGTATGATGTAGATACGGAAACTGTAATCGGCTTCCTCAAGGCTCATAATTTGGATAAGGATTTCAAGGTAAATATTGAAGTAAACCATGCTACATTGGCCGGTCATACTTTTGAACACGAATTGGCTGTTGCCGTAGACAACGATATGCTTGGCTCTATCGATGCCAACCGCGGTGACTATCAGAACGGTTGGGATACTGACCAGTTCCCCATTGACAACTTCGAGCTTATCCAAGCCATGATGCAGATTATTCGCGGTGGTGGCTTCAAAGATGGTGGTACAAACTTCGACGCTAAGACTCGTCGTAACTCTACCGACCTGGAAGATATTTTCATTGCACACATCGCTGGTATGGATGCTATGGCACGTGCTTTGGAAAGTGCAGCCAAGTTGCTTGAGGAATCTCCTTATAAGAAAATGTTGGCTGACCGCTATGCATCGTTCGATAGTGGCAAAGGTAAGGAGTTTGAAGAAGGCAAGCTGACATTGGAAGACGTTGTAGTTTATGCCAAGCAGAATGGCGAGCCTAAACAGACCAGCGGTAAGCAGGAATTGTATGAGGCAATTGTAAATATGTATGCCTGA 5749MI2_Bacteroides Amino   24MATKEYFPGIGKIKFEGKESKNPMAFRYYDANKVIMGKKMSEWLKFAMAWWHTLCAEGGD 004 AcidQFGGGTKTFPWNDSDNAVEAANHKVDAGFEFMQKMGIEYYCFHDVDLCTEAATIEEYEANLKEIVAYPKQKQAETGIKLLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIELGGSNYVFWGGREGYMSLLNTDQKREKEHLAQMLTMARDYARAKGFKGTFLVEPKPMEPTKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNDMLGSIDANRGDYQNGWDTDQFPIDNFELIQAMMQIIRGGGFKDGGTNFDAKTRRNSTDLEDIFIAHIAGMDAMARALESAAKLLEESPYKKMLADRYASFDSGKGKEFEEGKLTLEDVVVYAKQNGEPKQTSGKQELYEAIVNMYA 5750MI3_ Bacteroides DNA  25ATGGCAACAAAAGAGTATTTTCCTGGAATAGGAAAGATTAAATTTGAAGGAAAAGAGAGT 
003AAGAACCCGATGGCATTCCGTTGCTACGATGCAGAAAAAGTTATCATGGGTAAGAGAATGAAAGATTGGTTGAAGTTTGCAATGGCGTGGTGGCATACACTTTGTGCAGAAGGCGGTGACCAATTCGGTGGCGGTACAAAGAGTTTCCCCCGGAACGACTATACTGATAAAATTCAGGCTGCTAAAAACAAGATGGATGCCGGTTTTGAGTTTATGCAGAAGATGGGGATCGAATACTATTGTTTTCACGATGTAGACCTCTGCACGGAAGCTGATACCATTGAAGAATACGAAGCTAATTTGAAAGAAATCGTAGTTTACGCAAAGCAAAAGCAGGTAGAAACAGGTATCAAATTATTGTGGGGTACTGCCAATGTATTCGGTCATGAACGCTATATGAATGGTGCGGCTACCAACCCAGATTTTGATGTTGTAGCCCGTGCTGCTGTTCAGATTAAGAATGCAATTGATGCTACCATTGAACTAGGTGGCTTAAACTATGTGTTCTGGGGTGGACGCGAAGGTTATATGTCTTTGCTGAACACTGATCAGAAACGTGAGAAAGAACATCTTGCACAAATGCTGACCATTGCCCGTGACTATGCCCGTGCCCGTGGCTTCAAAGGTACATTCTTGGTTGAACCGAAACCGATGGAACCAACCAAACATCAATATGACGTAGATACAGAAACAGTTATCGGTTTTTTGAAAGCTCATGCTTTGGATAAAGACTTTAAAGTAAATATTGAAGTAAATCATGCAACATTAGCCGGTCATACATTTGAACACGAACTGGCAGTGGCTGTCGACAACGGTATGCTGGGTTCTATTGACGCTAATCGTGGTGATTGTCAAAACGGTTGGGATACAGACCAATTTCCCATTGATAACTATGAACTGACTCAAGCCATGATGCAGATTATTCGTAACGGTGGTTTGGGCAATGGTGGTACGAATTTTGACGCTAAAACTCGCCGTAATTCTACTGATCTTGGAGATATCTTCATTGCTCACATCGCAGGTATGGATGCTATGGCACGTGCATTGGAAAGTGCGGCCAAGTTGTTGGAAGAATCTCCCTATAAGAAGATGCTGGCAGAACGTTATGCATCCTTTGACAGCGGTAAGGGTAAAGAGTTTGAAGAGGGTAAGTTGACCTTGGAGGATCTTGTTGCTTATGCAAAAGTCAATGGCGAACCGAAACAAATCAGIGGTAAACAAGAATTGTATGAGGCAATTGTGAATATGTATTGCTAA 5750MI3_Bacteroides Amino   26MATKEYFPGIGKIKFEGKESKNPMAFRCYDAEKVIMGKRMKDWLKFAMAWWHTLCAEGGD 003 AcidQFGGGTKSFPRNDYTDKIQAAKNKMDAGFEFMQKMGIEYYCFHDVDLCTEADTIEEYEANLKEIVVYAKQKQVETGIKLLWGTANVFGHERYMNGAATNPDFDVVARAAVQIKNAIDATIELGGLNYVFWGGREGYMSLLNTDQKREKEHLAQMLTIARDYARARGFKGTFLVEPKPMEPTKHQYDVDTETVIGFLKAHALDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDCQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLGDIFIAHIAGMDAMARALESAAKLLEESPYKKMLAERYASFDSGKGKEFEEGKLTLEDLVAYAKVNGEPKQISGKQELYEAIVNMYC 5750MI4_ Bacteroides DNA  27ATGGCAACAAAAGAGTATTTTCCCGGAATAGGAAAGATTAAATTCGAAGGTAAAGAGAGC 
003AAGAACCCGATGGCATTCCGTTATTACGATGCCGATAAAGTAATCATGGGTAAGAAAATGAGCGAATGGCTGAAGTTCGCCATGGCATGGTGGCACACTCTTTGCGCAGAAGGTGGTGACCAGTTCGGTGGCGGAACAAAGAAATTCCCCTGGAACGGTGAGGCTGACAAGGTTCAGGCTGCCAAGAACAAAATGGACGCCGGCTTTGAATTCATGCAGAAAATGGGTATCGAATACTACTGCTTCCACGATGTAGACCTCTGCGAAGAAGCCGAGACCATTGAAGAATACGAAGCCAACTTGAAGGAAATCGTAGCGTATGCCAAGCAGAAACAAGCAGAAACCGGCATCAAGCTGTTGTGGGGTACTGCCAACGTATTCGGCCATGCCCGCTACATGAATGGTGCAGCCACCAACCCCGATTTCGATGTTGTGGCACGTGCAGCCGTCCAAATCAAAAGCGCCATCGACGCTACTATCGAGCTGGGAGGTTCGAACTATGTGTTCTGGGGCGGTCGCGAAGGCTACATGTCATTGCTGAATACAGACCAGAAGCGTGAGAAAGAGCACCTCGCACAGATGTTGACCATCGCCCGCGACTATGCCCGTGCCCGTGGCTTCAAAGGTACCTTCCTGATTGAACCGAAACCGATGGAACCTACAAAACACCAGTATGATGTAGACACCGAAACCGTTATCGGCTTCTTGAAGGCCCACAATCTGGACAAAGATTTCAAGGTAAACATCGAAGTGAACCACGCTACTTTGGCGGGCCACACCTTCGAGCACGAACTCGCAGTAGCCGTAGACAACGGTATGCTCGGCTCCATCGATGCCAACCGTGGTGACTACCAGAACGGCTGGGATACAGACCAGTTCCCCATTGACAACTTCGAACTGACCCAGGCAATGATGCAAATCATCCGTAACGGCGGCTTTGGCAATGGCGGTACAAACTTCGATGCCAAGACCCGTCGTAACTCCACCGACCTGGAAGACATCTTCATTGCCCACATCGCCGGTATGGACGTGATGGCACGTGCACTGGAAAGTGCAGCCAAATTGCTTGAAGAGTCTCCTTACAAGAAGATGCTTGCCGACCGCTATGCTTCCTTCGACAGTGGTAAAGGCAAGGAATTCGAAGACGGCAAGCTGACACTGGAGGATTTGGCAGCTTACGCAAAAGCCAACGGTGAGCCGAAACAGACCAGCGGCAAGCAGGGATTGTATGAGGCAATCGTAAATATGTACTGCTGA 5750MI4_Bacteroides Amino   28MATKEYFPGIGKIKFEGKESKNPMAFRYYDADKVIMGKKMSEWLKFAMAWWHTLCAEGGD 003 AcidQFGGGTKKFPWNGEADKVQAAKNKMDAGFEFMQKMGIEYYCFHDVDLCEEAETIEEYEANLKEIVAYAKQKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKSAIDATIELGGSNYVFWGGREGYMSLLNTDQKREKEHLAQMLTIARDYARARGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNFELTQAMMQIIRNGGFGNGGTNFDAKTRRNSTDLEDIFIAHIAGMDVMARALESAAKLLEESPYKKMLADRYASFDSGKGKEFEDGKLTLEDLAAYAKANGEPKQTSGKQGLYEAIVNMYC 5751MI4_ Bacteroides DNA  29ATGACAAAAGAGTATTTTCCAACCATTGGTAAAATTCAGTTTGAAGGTAAAGAGAGTAAG 
002AATCCATTAGCATATCGTTATTACGATGCTAACAAAGTAATAATGGGTAAAAAGATGAGCGAATGGCTCAAGTTTGCAATGGCATGGTGGCACACTTTGTGTGCTGAGGGTAGCGACCAGTTTGGTCCTGGCACCAAGTCATTCCCATGGAACGCATCAACCGACCGTATGCAGGCTGCAAAAGATAAGGCTGACGCAGGCTTCGAAATCATGCAAAAACTGGGCATCGAATACTACTGTTTCCATGATGTTGACCTCATCGACCCAGCAGACGATATTCCAACATACGAAAAGAATCTCAAGGAAATCGTTGCATACCTCAAGCAAAAACAGGCCGAGACAGGTATCAAATTGCTATGGGGTACAGCTAACGTATTTGGCCACAAGCGTTATATGAACGGTGCATCTACCAATCCTGACTTTGACGTTGTTGCACGAGCTATCGTGCAAATCAAGAATGCTATCGATGCAACAATCGAACTGGGCGGCACGAACTACGTATTCTGGGGTGGTCGCGAAGGTTACATGTCACTGCTCAACACCGACCAAAAGCGCGAGAAAGAGCACATGGCTACCATGTTAGGAATGGCACGTGACTATGCACGTTCTAAAGGCTTTACTGGTACTCTCCTTATCGAGCCAAAGCCTATGGAACCAACTAAGCATCAATACGACGTCGATACAGAAACTGTTATTGGTTTCCTCAAAGCTCACGGATTAGACAAGGACTTCAAGGTAAATATCGAAGTGAACCACGCTACATTGGCTGGCCATACCTTCGAACATGAATTAGCATGTGCTGTTGATGCAGGTATGCTTGGTTCCATCGATGCTAACCGTGGTGATATGCAGAATGGCTGGGATACAGATCAGTTCCCTATCAACAATTACGAGCTCGTTCAGGCCATGATGCAGATTATCCGCAATGGTGGTTTCGGTAACGGTGGTACAAACTTCGACGCTAAGACACGTCGTAATTCAACCGATTTGGAAGACATCATCATTGCTCACGTTTCAGCTATGGATGCTATGGCACGTGCTCTTGAATGTGCTGCAGACATTCTTCAAAACTCACCTATTCCACAGATGGTGGCCAACCGTTATGCAAGTTTTGACAAGGGTATAGGTAAAGATTTCGAAGACGGCAAGCTCACCCTCGAGCAAGTATACGAATATGGTAAGACCGTCGGCGAACCAGCTATTACAAGCGGCAAACAGGAGCTCTACGAAGCTATCGTTAATATGTATTGCTGA 5751MI4_Bacteroides Amino   30MTKEYFPTIGKIQFEGKESKNPLAYRYYDANKVIMGKKMSEWLKFAMAWWHTLCAEGSDQ 002 AcidFGPGTKSFPWNASTDRMQAAKDKADAGFEIMQKLGIEYYCFHDVDLIDPADDIPTYEKNLKEIVAYLKQKQAETGIKLLWGTANVFGHKRYMNGASTNPDFDVVARAIVQIKNAIDATIELGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLGMARDYARSKGFTGTLLIEPKPMEPTKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDMQNGWDTDQFPINNYELVQAMMQIIRNGGFGNGGTNFDAKTRRNSTDLEDIIIAHVSAMDAMARALECAADILQNSPIPQMVANRYASFDKGIGKDFEDGKLTLEQVYEYGKTVGEPAITSGKQELYEAIVNMYC 5751MI5_ Bacteroides DNA  31ATGGCTAACAAAGAATTTTTCCCCGGTATTGGTAAAATCAAATTCGAAGGTAAAGAGAGC 
003AAGAACCCCATGGCATATCGTTACTACGATGCTGAGAAGGTAGTCCTTGGCAAGAATATGAAAGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACATTGTGCGCCGAGGGTAGCGACCAGTTTGGTCCCGGCACTAAGTCTTTCCCCTGGAACACCGCAGAGTGCCCCATGCAGGCAGCTAAGGACAAGGTTGACGCTGGCTTCGAGTTCATGACCAAGATGGGTATTGAATACTTCTGCTTCCACGATGTAGACCTCGTTGCCGAGGCCGACACTGTTGAGGAGTACGAGGCTCGCATGAAGGAAATCGTTGCTTACATCAAGGAGAAGGTGGCCGAGACTGGCATCAAGAACCTGTGGGGTACAGCTAACGTATTTGGCAACAAGCGCTACATGAACGGTGCTGCTACTAACCCCGACTTTGACGTTGTGGCTCGCGCTATCGTTCAAATCAAGAACGCTATCGACGCTACTATCGAGCTCGGTGGTACGTCATACGTATTCTGGGGCGGCCGCGAGGGTTACATGAGCCTCTTGAACACCGACCAGAAGCGTGAGAAAGAGCACCTGGCTACTATGCTCACTATGGCACGCGACTACGCTCGCGCTAAGGGTTTCAAGGGTACATTCCTCATCGAGCCCAAGCCCATGGAGCCCACAAAGCACCAGTACGATGTTGACACTGAGACTGTAATCGGCTTCCTTAAGGCACACAACCTTGACAAGGACTTCAAGGTTAACATTGAGGTTAACCACGCAACTCTCGCTGGTCACACATTTGAGCACGAGCTCGCTTGTGCTGTTGACGCTGGCATGCTTGGCAGCATCGACGCTAACCGCGGTGACTACCAGAACGGCTGGGATACTGACCAATTCCCCATCGACAACTTCGACCTCACTCAAGCTATGCTCGAGATCATCCGCAACGATGGTTTCAAGGATGGTGGTACAAACTTCGACGCTAAGACTCGCCGCAACAGCACCGACCTCGAGGATATCTTCATCGCACACATCGCTGCTATGGACGCTATGGCACGTGCTCTCGAGAGCGCTGCTGCAGTACTCGAGGAGTCAGCTCTGCCCCAAATGAAGAAGGACCGCTATGCATCGTTCGACGCTGGCATGGGTAAGGACTTCGAGGACGGCAAGCTCACCCTGGAGCAAGTTTACGAGTATGGTAAGAAGGTGGGCGAGCCCAAGCAGACTAGCGGCAAGCAAGAGCTGTATGAGGCTATCCTCAACATGTACGTATAA 5751MI5_Bacteroides Amino   32MANKEFFPGIGKIKFEGKESKNPMAYRYYDAEKVVLGKNMKDWFKFAMAWWHTLCAEGSD 003 AcidQFGPGTKSFPWNTAECPMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEADTVEEYEARMKEIVAYIKEKVAETGIKNLWGTANVFGNKRYMNGAATNPDFDVVARAIVQIKNAIDATIELGGTSYVFWGGREGYMSLLNTDQKREKEHLATMLTMARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDYQNGWDTDQFPIDNFDLTQAMLEIIRNDGFKDGGTNFDAKTRRNSTDLEDIFIAHIAAMDAMARALESAAAVLEESALPQMKKDRYASFDAGMGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQELYEAILNMYV 5751MI6_ Bacteroides DNA  33ATGGCTAACAAAGAATTTTTCCCAGGTATTGGTAAAATCAAATTCGAAGGCAAAGAAAGC 
004AAGAACCCCATGGCATATCGTCACTACGATGCCGAGAAGGTAGTCCTTGGTAAGAAGATGAAGGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACTCTGTGCGCCGAGGGTAGCGACCAGTTCGGCCCCGTGACCAAGTCTTTCCCCTGGAACCAGGCCGAGTGCCCCATGCAGGCTGCTAAGGACAAGGTTGACGCCGGCTTCGAGTTCATGACCAAGATGGGTATCGAATACTTCTGTTTCCACGATGTAGACCTCGTTGCCGAGGCCGACACCGTTGAGGAGTACGAAGCTCGCATGAAGGAAATCGTGGCTTACATCAAGGAGAAGATGGCCGAGACCGGCATCAAGAACCTGTGGGGTACAGCCAACGTATTCGGCAACAAGCGCTACATGAACGGTGCTGCCACCAACCCCGACTTTGACGTTGTGGCTCGCGCAATCGTTCAGATCAAGAACGCCATCGACGCTACTATCGAGCTCGGCGGTACCTCTTACGTGTTCTGGGGCGGCCGCGAGGGTTACATGACTCTCTTGAACACCGACCAGAAGCGCGAGAAGGAGCACCTGGCTACCATGCTCACCATGGCTCGCGACTATGCTCGCGCTAAGGGCTTCAAGGGTACATTCCTTATCGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTATGACGTGGATACCGAGACCGTTATCGGCTTCCTCAAGGCTCACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTTAACCATGCAACTCTCGCCGGCCACACATTCGAGCACGAACTCGCTTGCGCTGTTGACGCTGGCATGCTGGGCAGCATCGACGCTAACCGCGGCGACTACCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTTCGACCTCACTCAGGCTATGCTCGAGATCATCCGCAACGGTGGTTTCAAGGACGGTGGTACAAACTTCGACGCTAAGACCCGTCGCAACAGCACCGATCTTGAGGACATCTTCATCGCTCACATCGCTGCTATGGACGCAATGGCACGCGCGCTCGAGAGCGCTGCCGCTGTGCTCGAGCAGAGCCCCCTTCCCCAGATGAAGAAAGACCGCTACGCATCGTTCGATGCCGGCATGGGCAAGGACTTCGAGGACGGCAAGCTCACTCTGGAGCAGGTTTACGAGTATGGTAAGAAGGTAGGCGAGCCCAAGCAGACCAGCGGCAAGCAGGAACTGTACGAGGCTATCCTCAACATGTATGTATAA 5751MI6_Bacteroides Amino   34MANKEFFPGIGKIKFEGKESKNPMAYRHYDAEKVVLGKKMKDWFKFAMAWWHTLCAEGSD 004 AcidQFGPVTKSFPWNQAECPMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEADTVEEYEARMKEIVAYIKEKMAETGIKNLWGTANVFGNKRYMNGAATNPDFDVVARAIVQIKNAIDATIELGGTSYVFWGGREGYMTLLNTDQKREKEHLATMLTMARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDYQNGWDTDQFPIDNFDLTQAMLEIIRNGGFKDGGTNFDAKTRRNSTDLEDIFIAHIAAMDAMARALESAAAVLEQSPLPQMKKDRYASFDAGMGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQELYEAILNMYV 5586MI22_ Clostridiales DNA  35ATGAAAGAATATTTTCCTATGACAAAAAAAGTTGAATATGAGGGCGCAGCATCTAAAAAT 
003CCATTTGCGTTTAAATACTATGATGCCGAAAGAATTATAGCAGGCAAGCCTATGAAAGAACATCTTAAATTTGCTATGAGTTGGTGGCATACACTTTGTGCGGGCGGTGCAGACCCATTTGGCACAACAACTATGGACAGAACATACGGCGGACTTACCGACCCAATGGAAATTGCAAAGGCAAAAGTAGATGCAGGCTTTGAGTTTATGCAAAAACTCGGTATAGAGTATTTTTGTTTTCACGATGCGGATATTGCACCGGAAGGAAGCAGTTTTGTTGAAACAAAGAAAAACTTTTGGGAAATAGTAGATTATATACAGCAAAAGATGAATGAAACAGGCATAAAGTTGCTTTGGGGTACTGCAAACTGCTTTAATGCTCCACGTTATATGCACGGTGCAGGAACATCATGCAATGCGCACAGTTTTGCATATGCAGCCGCACAGATAAAAAATGCAATTGAAGCTACCGTTAAACTGGGTGGAAAAGGCTATGTTTTCTGGGGCGGAAGAGAGGGTTATGAAACACTTCTCAATACGGATATGGCACTTGAACTTGACAATATGGCAAGACTTATGCATATGGCAGTTGATTATGGCAGAAGCATTGGTTTTGACGGTGATTTTTATATCGAACCAAAGCCAAAGGAACCAACAAAACATCAATATGACTTTGACTCGGCAACTGTTTTGGGATTTTTGAGAAAGTACGGTTTAGATAAGGATTTTAAACTTAATATAGAGGCAAATCATGCGACACTTGCAGGTCATACATTTGAACATGAATTGACTGTAGCGCGTATAAACGGTGCATTTGGCAGCATAGATGCAAATAGCGGCGATCCCAATCTTGGCTGGGATACCGACCAATTCCCAACAGATGTTTATTCGGCAACCCTTTGTATGCTTGAAGTGATAAGAGCAGGCGGCTTTACAAACGGAGGTCTTAATTTTGATGCAAAGGTCAGAAGAGGCTCATTTACGTTTGATGACATTGTTTATGCATATATCAGCGGTATGGACACTTTTGCGCTGGGTTTTATAAAGGCATATGAAATAATTGAGGACGGCAGAATAGATGAATTTGTAAAAGAAAGATACGCAAGCTATAATACAGGCATAGGCAAAGATATTATAGATGGAAAGGCAAGCCTTGAAAGTTTGGAAGAATATATTCTTTCAAATGATAATGTTGTAATGCAAAGCGGCAGACAGGAATATCTTGAAACAGTTTTGAATAATATTTTGTTTAAAGCATAA 5586MI22_Clostridiales Amino   36MKEYFPMTKKVEYEGAASKNDFAFKYYDAERIIAGKPMKEHLKFAMSWWHTLCAGGADPF 003 AcidGTTTMDRTYGGLTDPMEIAKAKVDAGFEFMQKLGIEYFCFHDADIAPEGSSFVETKKNFWEIVDYIQQKMNETGIKLLWGTANCFNAPRYMHGAGTSCNAHSFAYAAAQIKNAIEATVKLGGKGYVFWGGREGYETLLNTDMALELDNMARLMHMAVDYGRSIGFDGDFYIEPKPKEPTKHQYDFDSATVLGFLRKYGLDKDFKLNIEANHATLAGHTFEHELTVARINGAFGSIDANSGDPNLGWDTDQFPTDVYSATLCMLEVIRAGGFTNGGLNFDAKVRRGSFTFDDIVYAYISGMDTFALGFIKAYEIIEDGRIDEFVKERYASYNTGIGKDIIDGKASLESLEEYILSNDNVVMQSGRQEYLETVLNNILFKA 1753MI4_ Firmicutes DNA  37ATGAAAGAAATTTTCCCAAATATTCCTGAGATTAAATTCGAAGGAAAAGACAGCAAAAAT 
001CCTTTTGCTTTCCATTACTACAACCCAGACCAAATCATCTTAGGCAAACCAATGAAAGAACACCTCCCATTCGCTATGGCTTGGTGGCACAATCTTGGTGCAACAGGTGTTGATATGTTTGGCGCTGGCCCAGCTGATAAGAGTTTCGGTGCTAAAGTTGGCACAATGGAACACGCTAAGGCCAAAGTCGATGCCGGTTTCGAATTCATGAAGAAACTCGGTATCAGATATTTCTGCTTCCATGATGTTGACTTAGTTCCAGAATGTGCAGATATCAAAGATACAAACAAAGAATTAGATGAAATCAGTGACTACATCTTAGAAAAGATGAAAGGCACAGATATTAAGTGTTTATGGGGCACCGCCAATATGTTCTCTAACCCACGCTTCTGCAATGGIGCGGGTTCCACAAACAGTGCGGATGTCTTCGCTTTCGCCGCTGCTCAAGTTAAGAAAGCCTTAGATATCACCGTTAAATTAGGTGGTAGGGGTTACGTCTTCTGGGGTGGTCGTGAAGGTTACGAAACATTACTCAATACAGACGTTAAATTCGAACAAGAAAACATTGCTCGTTTAATGAAGATGGCTGTTGAATATGGCCGTTCCATCGGTTTCAAAGGCGATTTCTATATCGAACCAAAACCAAAAGAACCAATGAAACACCAATATGACTTCGACGCCGCTACAGCTATTGGCTTCTTAAGAGCCCACGGCTTAGACAAAGACTTCAAGTTGAACATCGAAGCTAACCACGCTACATTAGCGGGTCATACATTCCAACACGATTTAAGAATCTCCGCCATTAATGGTATGTTAGGTTCTATCGATGCTAACCAAGGCGATATGCTCTTAGGTTGGGATACAGACGAATTCCCATTTGATGTCTACAGTGCGACACAATGTATGTACGAAGTCTTAAAGAATGGTGGTCTTACAGGTGGTTTCAACTTTGACTCCAAAACACGTCGTCCATCCTACACAATGGAAGATATGTTCTTAGCCTATATCTTAGGTATGGATACATTCGCTTTAGGTTTAATCAAAGCTGCTCAAATCATCGAAGATGGCCGTATTGATCAATTCATCGAAAAGAAATATTCTTCCTTCCGTGAAACAGAAATCGGTCAAAAGATCTTAAACAACAAGACAAGCTTAAAAGAATTATCCGATTACGCTTGCAAGATGGGTGCTCCAGAACTTCCAGGTAGTGGTCGTCAAGAAATGCTCGAAGCCATCGTTAACGATGTCTTATTCGGCAAG TAA1753MI4_ Firmicutes Amino   38MKEIFPNIPEIKFEGKDSKNPFAFHYYNPDQIILGKPMKEHLPFAMAWWHNLGATGVDMF 001 AcidGAGPADKSFGAKVGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDLVPECADIKDTNKELDEISDYILEKMKGTDIKCLWGTANMFSNPRFCNGAGSTNSADVFAFAAAQVKKALDITVKLGGRGYVFWGGREGYETLLNTDVKFEQENIARLMKMAVEYGRSIGFKGDFYIEPKPKEPMKHQYDFDAATAIGFLRAHGLDKDFKLNIEANHATLAGHTFQHDLRISAINGMLGSIDANQGDMLLGWDTDEFPFDVYSATQCMYEVLKNGGLTGGFNFDSKTRRPSYTMEDMFLAYILGMDTFALGLIKAAQIIEDGRIDQFIEKKYSSFRETEIGQKILNNKTSLKELSDYACKMGAPELPGSGRQEMLEAIVNDVLFGK 1753MI6_ Firmicutes DNA  39ATGAAAGAAATTTTCCCAAATATTCCTGAGATTAAATTCGAAGGAAAAGACAGCAAAAAT 
001CCTTTTGCTTTCCATTACTACAACCCAGACCAAATCATCTTAGGTAAACCAATGAAAGAACACCTCCCATTCGCTATGGCTTGGTGGCACAATCTTGGTGCAACAGGTGTTGATATGTTTGGCGCTGGCCCAGCTGATAAGAGTTTCGGTGCTAAAGTTGGCACAATGGAACACGCTAAGGCCAAAGTCGATGCCGGTTTCGAATTCATGAAGAAACTTGGTATCAGATATTTCTGCTTCCATGATGTTGACTTAGTTCCAGAATGTGCAGATATCAAAGATACAAACAAAGAATTAGATGAAATCAGTGACTACATCTTAGAAAAGATGAAAGGCACAGATATCAAGTGTTTATGGGGCACCGCCAATATGTTCTCTAACCCACGTTTCTGCAATGGTGCGGGTTCCACAAACAGTGCGGATGTCTTCGCTTTCGCCGCTGCTCAAGTTAAGAAAGCCTTAGATATCACCGTTAAATTAGGTGGTAGGGGTTACGTCTTCTGGGGTGGTCGTGAAGGTTACGAAACATTACTCAATACAGACGTTAAATTCGAACAAGAAAACATTGCTCGTTTAATGAAGATGGCTGTTGAATATGGCCGTTCCATCGGTTTCAAAGGCGATTTCTATATCGAACCAAAACCAAAAGAACCAATGAAACACCAATATGACTTCGACGCCGCTACAGCTATTGGCTTCTTAAGAGCCCACGGCTTAGACAAAGACTTCAAGTTGAACATCGAAGCTAACCACGCTACATTAGCGGGTCATACATTCCAACACGATTTAAGAATCTCCGCCATTAATGGTATGTTAGGTTCTATCGATGCTAACCAAGGCGATATGCTCTTAGGTTGGGATACAGACGAATTCCCATTTGATGTCTACAGTGCGACACAATGTATGTACGAAGTCTTAAAGAATGGTGGTCTTACAGGTGGTTTCAACTTTGACTCCAAAACACGTCGTCCATCCTACACAATGGAAGATATGTTCTTAGCCTATATCTTAGGTATGGATACATTCGCTTTAGGTTTAATCAAAGCTGCTCAAATCATCGAAGATGGCCGTATTGATCAATTCATCGAAAAGAAATATTCTTCCTTCCGTGAAACAGAAATCGGTCAAAAGATCTTAAACAACAAGACAAGCTTAAAAGAATTATCCGATTACGCTTGCAAGATGGGTGCTCCAGAACTTCCAGGTAGTGGTCGTCAAGAAATGCTCGAAGCCATCGTTAACGATGTCTTATTCGGCAAG TAA1753MI6_ Firmicutes Amino   40MKEIFPNIPEIKFEGKDSKNPFAFHYYNPDQIILGKDMKEHLPFAMAWWHNLGATGVDMF 001 AcidGAGPADKSFGAKVGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDLVPECADIKDTNKELDEISDYILEKMKGTDIKCLWGTANMFSNPRFCNGAGSTNSADVFAFAAAQVKKALDITVKLGGRGYVFWGGREGYETLLNTDVKFEQENIARLMKMAVEYGRSIGFKGDFYIEPKPKEPMKHQYDFDAATAIGFLRAHGLDKDFKLNIEANHATLAGHTFQHDLRISAINGMLGSIDANQGDMLLGWDTDEFPFDVYSATQCMYEVLKNGGLTGGFNFDSKTRRDSYTMEDMFLAYILGMDTFALGLIKAAQIIEDGRIDQFIEKKYSSFRETEIGQKILNNKTSLKELSDYACKMGAPELPGSGRQEMLEAIVNDVLFGK 1753MI35_ Firmicutes DNA  41ATGGAATATTTCCCTTTCGTCAAATCGGTCCAATACAAGGGACCAACCTCAACTGAACCA 
004TTCGCTTTCAAGTACTACGATGCCAACCGTGTCGTTCTTGGAAAACCAATGAAAGAATGGATGCCATTCGCTATGGCTTGGTGGCACAACCTCGGCGCTGCCGGTACCGACATGTTCGGCGGCAACACCATGGACAAGTCCTGGGGAGTCGATAAAGAAAAAGACCCAATGGGCTATGCCAAAGCCAAAGTTGATGCCGGCTTCGAATTCATGCAGAAGATGGGCATCGAATACTACTGCTTCCACGATGTCGACCTCGTCCCAGAGTGCGACGACATCACCGTTATGTACCAGAGACTCGATGAGATCGGTGATTACCTTCTCAAGAAACAGAAGGAAACCGGTATCAAGCTTCTTTGGTCAACCGCCAATGCCTTCGGACACCGCCGTTTCATGAACGGTGCTGGTTCCAGCAACTCCGCCGAAGTCTATTGCTTCGCCGCCGCCCAGATCAAGAAAGCTCTTGAGCTCTGCGTCAAACTCGGTGGCAAAGGCTATGTCTTCTGGGGTGGACGTGAAGGCTACGAAACCCTTCTCAACACCGACATGAAGTTCGAACAAGAGAACATCGCCAACCTTATGAGATGCGCCCGTGACTACGGCCGCAAGATCGGTTTCAAAGGCGACTTCTACATCGAACCAAAACCAAAAGAGCCAACAAAGCATCAGTATGACTTCGACGCCGCTACCGCCATCGGATTCCTCCGTCAGTACGGTCTCGACAAAGACTTCAAGATGAACATCGAAGCCAACCACGCTACCTTAGCTGGCCACACCTTCGAACACGAACTCCGCGTCTCCGCCATGAACGGCATGCTCGGTTCCATCGACGCCAACGAAGGCGATATGCTCCTCGGATGGGATGTCGACCGTTTCCCAGCCAACGTCTATAGCGCCACCTTCGCCATGCTCGAAGTCATCAAAGCCGGTGGACTTACCGGTGGCTTCAACTTCGACGCCAAGACCCGCCGCGCTTCCAACACCTATGAAGATATGTTCAAGGCTTTCGTCCTTGGTATGGATACCTTCGCTTTAGGTCTTCTCAATGCCGAAGCCATCATCAAAGACGGCCGCATCGACAAGTTCGTCGAGGATAGATATGCCAGCTTCAAGACCGGCATCGGTGCTAAGGTCCGCGATCACTCCGCTACCCTTGAGGATTTAGCTGCCCACGCCCTTGAGACCAAGGTTTGCCCAGATCCAGGCAGCGGCGACGAGGAAGAACTCCAGGAAATCCTCAACCAGTTAATGTTCGGTAAG AAATAA1753MI35_ Firmicutes Amino   42MEYFPFVKSVQYKGPTSTEPFAFKYYDANRVVLGKPMKEWMPFAMAWWHNLGAAGTDMFG 004 AcidGNTMDKSWGVDKEKDPMGYAKAKVDAGFEFMQKMGIEYYCFHDVDLVPECDDITVMYQRLDEIGDYLLKKQKETGIKLLWSTANAFGHRRFMNGAGSSNSAEVYCFAAAQIKKALELCVKLGGKGYVFWGGREGYETLLNTDMKFEQENIANLMRCARDYGRKIGFKGDFYIEPKPKEPTKHQYDFDAATAIGFLRQYGLDKDFKMNIEANHATLAGHTFEHELRVSAMNGMLGSIDANEGDMLLGWDVDRFPANVYSATFAMLEVIKAGGLTGGFNFDAKTRRASNTYEDMFKAFVLGMDTFALGLLNAEAIIKDGRIDKFVEDRYASFKTGIGAKVRDHSATLEDLAAHALETKVCPDPGSGDEEELQEILNQLMFGKK 1754MI9_ Firmicutes DNA  43ATGAGCGAATTTTTTAAGAATATTCCAGAGATTAAATTCGAAGGAAAAGATAGTAAAAAT 
004CCATGGGCATTCAAGTATTACAATCCTGAATTGACCATTATGGGTAAAAAAATGTCTGAACATCTTCCTTTTGCAATGGCCTGGTGGCATAACCTTGGCGCAAATGGAGTTGATATGTTCGGTTCGGGAACCGCCGATAAATCTTTCGGTCAGGCTCCGGGAACTATGGAGCACGCAAAGGCTAAGGTAGATGCAGGTATCGAGTTTATGAAGAAACTCGGAATCAAGTACTACTGCTGGCATGATGTAGACCTTGTTCCTGAAGATCCAAACGATATCAACGTAACAAACAAGCGCCTTGATGAGATTTCAGATTATATCCTTGAAAAAACAAAGGGAACTGACATCAAGTGTCTCTGGGGAACTGCTAACATGTTCAGTAATCCCCGCTTTATGAACGGGGCAGGCTCAACAAACTCTGCTGACGTTTACTGCTTTGCAGCTGCCCAGGTTAAAAAGGCTCTTGAGATTACCGTAAAGCTTGGTGGCCGCGGTTATGTATTCTGGGGTGGACGCGAAGGTTATGAAACTCTTCTTAATACAGATGTAAAGCTTGAACAGGAAAATATTGCAAACCTTATGCACATGGCAGTTGATTATGGCCGTTCAATCGGTTTCAAGGGAGACTTCTACATCGAGCCTAAGCCAAAGGAGCCGATGAGTCATCAGTATGATTTTGATGCCGCAACTGCAATCGGCTTCCTCCGCCAGTATGGCCTCGACAAAGACTTTAAGATGAACATTGAGGCTAACCACGCTTCTCTTGCAAATCATACCTTCCAGCATGAGCTTTATATCAGCCGCATTAACGGAATGCTTGGTTCTGTAGATGCTAACCAGGGAAATCCAATTCTCGGCTGGGATACAGATAACTTCCCTTGGAATGTCTACGACGCAACTCTTGCAATGTACGAAGTACTCAAGGCTGGTGGACTTACAGGTGGCTTCAACTTTGACTCAAAGAACCGCCGCCCATCAAATACATTTGAAGATATGTTCCACGCTTACATCATGGGAATGGACACTTTTGCTCTTGGTCTTATTAAGGCTGCAGAAATTATTGAAGACGGAAGAATCGATGGCTTCATTAAAGAAAAGTATTCAAGCTACGAAAGTGGAATTGGTAAGAAGATCCGCGACAAGCAGACAACTTTGGAAGAGCTTGCTGCCCGTGCCGCAGAAATGAAAAAGCCATCTGATCCAGGTTCAGGCCGCGAGGAATATCTGGAAGGAGTTGTTAACAATATCCTCTTTCGCGGA TAA1754MI9_ Firmicutes Amino   44MSEFFKNIPEIKFEGKDSKNPWAFKYYNPELTIMGKKMSEHLPFAMAWWHNLGANGVDMF 004 AcidGSGTADKSFGQAPGTMEHAKAKVDAGIEFMKKLGIKYYCWHDVDINPEDPNDINVTNKRLDEISDYILEKTKGTDIKCLWGTANMFSNPRFMNGAGSTNSADVYCFAAAQVKKALEITVKLGGRGYVFWGGREGYETLLNTDVKLEQENIANLMHMAVDYGRSIGFKGDFYIEPKPKEPMSHQYDFDAATAIGFLRQYGLDKDFKMNIEANHASLANHTFQHELYISRINGMLGSVDANQGNPILGWDTDNFPWNVYDATLAMYEVLKAGGLTGGFNFDSKNRRPSNTFEDMFHAYIMGMDTFALGLIKAAEIIEDGRIDGFIKEKYSSYESGIGKKIRDKQTTLEELAARAAEMKKPSDPGSGREEYLEGVVNNILFRG 1754MI22_ Firmicutes DNA  45ATGAGCGAGTTTTTTAAGAATATTCCTCAAATAAAATACGAAGGAAAAGATAGCAAAAAT 
004CCCTGGGCATTCAAGTATTACAATCCTGAATTGACAATCATGGGTAAAAAGATGAGCGAACATCTTCCATTCGCAATGGCATGGTGGCATAACCTTGGCGCAAACGGCGTTGATATGTTTGGTCAGGGAACAGCAGACAAGTCTTTCGGACAGATTCCTGGAACTATGGAGCATGCAAAGGCTAAGGTTGATGCTGGTATAGAGTTTATGAAGAAGCTCGGAATCAAATATTACTGCTGGCACGATGTTGACCTTGTTCCTGAGGATCCAAACGATATCAACGTAACTAACAAACGTCTGGACGAAATTTCAGATTACATCCTTGAAAAGACAAAAGGAACAGACATTAAGTGTCTCTGGGGAACTGCAAACATGTTCGGTAACCCTCGCTTTATGAACGGTGCAGGCTCTACAAACTCTGCTGACGTTTACTGTTTTGCTGCCGCTCAGGTAAAAAAGGCTCTTGAGATTACTGTAAAGCTTGGTGGCCGAGGTTATGTTTTCTGGGGTGGCCGCGAAGGTTACGAAACTCTTCTCAATACAGACGTAAAACTTGAACAGGAAAATATCGCAAACCTCATGCATATGGCTGTTGATTATGGCCGCTCAATCGGTTTCAAGGGAGACTTCTACATCGAGCCTAAGCCAAAGGAGCCAATGAGCCATCAGTATGATTTTGATGCTGCAACAGCAATCGGCTTCCTCCGCCAGTATGGCCTCGACAAAGATTTTAAGATGAACATCGAAGCTAACCATGCCTCACTTGCAAATCACACCTTCCAGCACGAGCTTTGTATCAGCCGCATAAACGGAATGCTTGGTTCTGTAGATGCAAATCAGGGAAATCCAATTCTTGGCTGGGATACAGATAACTTCCCATGGAATGTTTACGATGCAACTCTGGCAATGTACGAAGTTCTCAAGGCTGGCGGTCTAACAGGTGGCTTCAACTTTGACTCAAAGAACCGICGCCCATCAAATACTTTTGAAGAIATGTTCCACGCTTATATCATGGGTATGGATACTTTTGCCCTTGGCCTTATTAAGGCTGCAGAAATTATTGAAGACGGCAGAATTGACGGCTTCATCAAAGAAAAGTATTCAAGCTTTGAAAGTGGAATTGGTAAGAAGATTCGTGACAAGCAGACAAGTTTGGAAGAGCTTGCAGCTCGTGCCGCTGAAATGAAAAAGCCATCTGATCCAGGTTCAGGCCGCGAGGAATACCTCGAAGGAGTTGTTAACAACATCCTCTTTCGCGGA TAA1754MI22_ Firmicutes Amino   46MSEFFKNIPQIKYEGKDSKNPWAFKYYNPELTIMGKKMSEHLPFAMAWWHNLGANGVDMF 004 AcidGQGTADKSFGQIPGTMEHAKAKVDAGIEFMKKLGIKYYCWHDVDLVPEDPNDINVTNKRLDEISDYILEKTKGTDIKCLWGTANMFGNPRFMNGAGSTNSADVYCFAAAQVKKALEITVKLGGRGYVFWGGREGYETLLNTDVKLEQENIANLMHMAVDYGRSIGFKGDFYIEPKPKEPMSHQYDFDAATAIGFLRQYGLDKDFKMNIEANHASLANHTFQHELCISRINGMLGSVDANQGNPILGWDTDNFPWNVYDATLAMYEVLKAGGLTGGFNFDSKNRRDSNTFEDMFHAYIMGMDTFALGLIKAAEIIEPGRIDGFIKEKYSSFESGIGKKIRDKQTSLEELAARAAEMKKPSDRGSGREEYLEGVVNNILFRG 727MI1_ Firmicutes DNA  47ATGATATTTGAAAATATTCCCGCAATTCCTTATGAGGGTCCGAAGAGCACAAATCCGCTG 
002GCGTTTAAATTCTATGATCCGGACAAGATCGTTATGGGAAAGCCCATGAAGGAGCATCTGCCCTTTGCAATGGCCTGGTGGCACAACCTTGGCGCGGCCGGAACCGATATGTTCGGGCGCGATACCGCCGACAAATCCTTCGGTGCGGTAAAAGGCACAATGGAGCATGCCAAAGCGAAAGTCGATGCCGGCTTTGAGTTCATGCAGAAGCTGGGGATCCGCTATTTCTGCTTCCATGATGTGGATCTTGTTCCGGAGGCGGATGATATAAAGGAGACCAACCGCCGTCTGGACGAGATCAGCGATTACATCCTTGAAAAGATGAAGGGCACCGATATCAAGTGCCTTTGGGGCACGGCCAATATGTTCTCAAATCCGCGCTTTATGAACGGCGCAGGCTCCTCCAATTCTGCCGATGTATTCGCTTTTGCGGCAGCACAGGCCAAGAAGGCCTTGGATCTGACCGTCAAACTCGGCGGGCGCGGCTATGTCTTCTGGGGCGGACGTGAGGGCTATGAGACACTTCTCAATACCGACATGAAGTTCGAGCAGGAGAATATCGCGAAGCTCATGCATATGGCTGTCGATTACGGCCGCAGCATAGGCTTTACCGGTGATTTCTATATCGAGCCCAAACCGAAAGAGCCGATGAAACACCAGTATGATTTCGATGCAGCCACTGCGATAGGCTTCCTCCGCCAGTACGGACTCGATAAGGACTTCAAGCTCAACATCGAGGCAAACCACGCCACACTGGCAGGTCACACTTTCCAGCACGATCTGCGTGTTTCCGCAATAAACGGAATGCTGGGCAGCATTGACGCCAACCAGGGCGATATGCTCCTCGGCTGGGATACCGACGAGTTCCCGTTCAATGTATATGATGCGACCATGTGCATGTATGAGGTGCTCAAGTCAGACGGGCTCACCGGCGGCTTTAACTTCGACTCCAAATCACGCCGCCCGAGCTATACGGTCGAGGATATGTTTACAAGCTATATCCTCGGCATGGACACTTTTGCCCTCGGCCTTCTGAAAGCGGCCGAGCTTATCGAAGACGGAAGGCTTGACGCCTTCGTCAAAGAACGCTATTCAAGCTATGAGAGCGGCATCGGCGCAAAGATCCGCAGCGGAGAAACCGATTTGAAGGAATTGGCGGAATATGCGGACTCCCTCGGAGCCCCCGAACTTCCGGGCAGCGGAAAACAGGAACAGCTCGAGAGCATAGTAAATCAGATACTTTTCGGATAA 727MI1_ FirmicutesAmino   48 MIFENIPAIPYEGPKSTNPLAFKFTDPDKIVMGKPMKEHLPFAMAWWHNLGAAGTDMFGR002 Acid DTADKSFGAVKGTMEHAKAKVDAGFEFMQKLGIRYFCFHDVDLVPEADDIKETNRRLDEISDYILEKMKGTDIKCLWGTANMFSNPRFMNGAGSSNSADVFAFAAAQAKKALDLTVKLGGRGYVFWGGREGYETLLNTDMKFEQENIAKLMHMAVDYGRSIGFTGDFYIEPKPKEPMKHQYDFDAATAIGFLRQYGLDKDFKLNIEANHATLAGHTFQHDLRVSAINGMLGSIDANQGDMLLGWDTDEFPFNVYDATMCMYEVLKSDGLTGGFNFDSKSRRPSYTVEDMFTSYILGMDTFALGLLKAAELIEDGRLDAFVKERYSSYESGIGAKIRSGETDLKELAEYADSLGAPELPGSGKQEQLESIVNQILFG 727MI9_ Firmicutes DNA  49ATGAGCGAGTTTTTTGCCAGCATTCCCAAAATTCCCTTTGAAGGCAAGGACAGCGCCAAT 
005CCCCTGGCGTTCAAATACTACGACGCCGACAGGATGATACTGGGCAAGCCCATGAAGGAGCACCTTCCCTTCGCCATGGCCTGGTGGCACAACCTGTGCGCCGCGGGCACCGATATGTTTGGCCGGGACACCGCCGACAAGTCCTTCGGCCAGGTCAAGGGCACCATGGAACACGCCAAGGCCAAGGTGGACGCGGGCTTTGAGTTCATGAAGAAGCTGGGCATCCGCTACTTCTGCTTCCACGACGTGGACATCGTGCCCGAAGCCGACGACATCAAGGAAACCAACCGCCGTCTGGACGAGATCTCCGACTATATCCTGGAGAAAATGAAAGGCACCGACATCCAGTGCCTGTGGGGCACCGCCAACATGTTCGGCAACCCCCGCTATATGAACGGCGCGGGCAGCTCCAACTCCGCCGACGTATACTGCTTCGCCGCGGCCCAGATCAAAAAGGCCCTGGACATCACCGTGAAGCTGGGCGGCAAGGGCTACGTGTTCTGGGGCGGCCGCGAGGGCTACGAGACCCTGCTGAACACCGATATGAAGTTCGAGCAGGAGAACATCGCCCGCCTGATGCACATGGCCGTGGACTACGGCCGCAGCATCGGCTTCACCGGCGATTTCTACATCGAGCCCAAGCCCAAGGAGCCCATGAAGCACCAGTACGACTTCGACGCCGCCACCGCCATAGGCTTTTTGCGCCAGTACGGCCTGGACAAGGATTTCAAGCTGAACATCGAGTCCAACCACGCCACCCTGGCGGGCCATACCTTCCAGCACGACCTGCGCGTTTCCGCCATCAACGGCATGCTGGGCTCCATCGACGCCAACCAGGGCGACTACCTGCTGGGCTGGGATACCGACGAGTTCCCCTACAGCGTATACGAGACCACCATGTGCATGTACGAGGTGCTCAAGGCCGGAGGTCTCACCGGCGGCTTCAATTTCGACGCCAAGAACCGCCGTCCCAGCTACACCCCCGAGGATATGTTCCACGCCTACATCCTTGGGATGGACAGCTTCGCCCTGGGCCTGATCAAGGCCGCCGAGCTCATCGAGGACGGTCGCCTGGACGCCTTCGTCCGGGACCGCTACCAGAGCTGGGAGACCGGCATCGGCGATAAGATCCGCAAGGGCGAGACCACACTGGCCGAGCTGGCCGAGTACGCCGCCCGGATGGGCGCGCCCGCGCTGCCCGGCAGCGGCCGCCAGGAATACCTGGAGGGCGTGGTCAACAATATCCTGTTCAAATAA 727MI9_Firmicutes Amino   50MSEFFASIPKIPFEGKDSANDLAFKYYDADRMILGKPMKEHLDFAMAWWHNLCAAGTDMF 005 AcidGRDTADKSFGQVKGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDIVPEADDIKETNRRLDEISDYILEKMKGTDIQCLWGTANMFGNPRYMNGAGSSNSADVYCFAAAQIKKALDITVKLGGKGYVFWGGREGYETLLNTDMKFEQENIARLMHMAVDYGRSIGFTGDFYIEPKPKEPMKHQYDFDAATAIGFLRQYGLDKDFKLNIESNHATLAGHTFQHDLRVSAINGMLGSIDANQGDYLLGWDTDEFPYSVYETTMCMYEVLKAGGLTGGFNFDAKNRRPSYTPEDMFHAYILGMDSFALGLIKAAELIEDGRLDAFVRDRYQSWETGIGEKIRKGETTLAELAEYAARMGAPALPGSGRQEYLEGVVNNILFK 727MI27_ Firmicutes DNA  51ATGAAGACCTATTTCAAAAAAATCCCCGTGATCCCCTACGAGGGACCGAAGTCCCAGAAT 
002CCGCTGTCGTTCAAATTCTATGACGCGGACCGCATCGTTCTCGGCAAGCCCATGAAGGAGCATCTGCCCTTCGCCATGGCCTGGTGGCACAATCTGGGTGCTGCCGGAACGGACATGTTCGGCCGCGATACCGCCGACAAGTCCTTCGGAGCGGAGAAGGGCACCATGGAGCATGCCAAGGCCAAGGTGGACGCTGGCTTCGAGTTTATGAAGAAGGTGGGCATCCGGTATTTCTGCTTCCATGACGTGGATCTGGTCCCGGAAGCGGACGACATCAAGGAGACCAACCGCCGTCTCGATGAGATCAGCGACTACATCCTCAAGAAGATGAAGGGCACGGATATCAAGTGCCTCTGGGGCACCGCCAACATGTTCGGCAATCCCCGGTTCATGAACGGCGCGGGCAGCTCCAACAGCGCGGACGTGTTCTGCTTTGCCGCGGCCCAGGTGAAGAAGGCCTTGGACATCACCGTCAAGCTGGGCGGCCGGGGCTATGTGTTCTGGGGCGGCCGTGAGGGGTATGAGTCCCTGCTGAACACGGACGTGAAGTTTGAGCAGGAGAACATCGCCAAGCTCATGCACCTTGCCGTGGACTACGGCCGCAGCATCGGCTTCACCGGCGATTTCTACATCGAGCCCAAGCCCAAGGAGCCCATGAAGCACCAGTACGACTTCGATGCCGCCACCGCCATCGGCTTCCTCAGGCAGTACGGCCTCGATAAGGACTTCAAGATGAACATTGAAGCCAACCACGCGACCCTGGCCGGCCACACCTTCCAGCACGACCTCAGGATCAGCGCCATCAACGGGATGCTGGGCTCCATCGACGCCAACCAGGGCGACCTCCTGCTGGGATGGGACACCGACGAATTCCCCTTCAACGTCTATGAGGCCACCATGTGCATGTACGAGGTCCTCAAGGCCGGCGGCCTCACCGGCGGCTTCAACTTCGACTCAAAGAACCGCCGTCCCTCCTACACCATGGAGGATATGTTCCACGCCTACATCCTGGGCATGGACACCTTCGCCCTGGGTCTTCTCAAGGCCGCGGAGCTCATCGAGGACGGTCGGATCGACAAATTCGTGGAGGAGCGCTACGCCAGCTACAAGACCGGCATCGGCGCCAAGATCCGTTCCGGCGAGACCACGCTTCAGGAGCTGGCCGCCTATGCCGACAAGTTGGGCGCGCCTGCCCTTCCCGGCAGCGGCCGTCAGGAGTACCTGGAGAGCATCGTCAACCAGGTGCTCTTCGGGATGTGA 727MI27_Firmicutes Amino   52MKTYFKKIPVIPYEGPKSQNPLSFKFYDADRIVLGKPMKEHLPFAMAWWHNLGAAGTDMF 002 AcidGRDTADKSFGAEKGTMEHAKAKVDAGFEFMKKVGIRYFCFHDVDLVPEADDIKETNRRLDEISDYILKKMKGTDIKCLWGTANMFGNPRFMNGAGSSNSADVFCFAAAQVKKALDITVKLGGRGYVFWGGREGYESLLNTDVKFEQENIAKLMHLAVDYGRSIGFTGDFYIEPKPKEPMKHQYDFDAATAIGFLRQYGLDKDFKMNIEANHATLAGHTFQHDLRISAINGMLGSIDANQGDLLLGWDTDEFPFNVYEATMCMYEVLKAGGLTGGFNFDSKNRRPSYTMEDMFHAYILGMDTFALGLLKAAELIEDGRIDKFVEERYASYKTGIGAKIRSGETTLQELAAYADKLGAPALPGSGRQEYLESIVNQVLFGM 1753MI2_ Neocalli- DNA  53ATGGCTAAAGAGTATTTTCCAGAGATTGGCAAAATCAAGTTTGAAGGCAAGGACAGCAAA 006mastigales 
AACCCAATGGCTTTCCACTACTATGACCCCGAGAAGGTGATCATGGGCAAGCCTATGAAAGACTGGCTCCGCTTCGCTATGGCATGGTGGCACACCCTCTGCGCAGAAGGTGGCGACCAGTTCGGTGGCGGCACTAAGAAGTTCCCTTGGAACAACGGCGCTGACGCTGTAGAAATCGCAAAACAGAAGGCTGACGCAGGTTTCGAAATCATGCAGAAGCTCGGCATCCCATATTTCTGCTTCCACGACGTGGACCTCGTGTCTGAGGGCGCATCTGTAGAAGAGTATGAGGCTAACCTCAAGGCTATCACAGACTACCTCGCTGTGAAGATGAAGGAAACAGGCATCAAGCTCCTGTGGTCTACTGCCAACGTATTCGGCAACGGCCGCTACATGAACGGTGCTTCTACCAACCCTGACTTCGACGTCGTTGCTCGCGCTATCGTGCAGATTAAGAACGCTATCGACGCTGGTATCAAGCTCGGCGCTGAGAACTACGTGTTCTGGGGCGGACGCGAAGGCTACATGAGCCTCCTCAACACCGACCAGAAGCGTGAGAAGGAGCACATGGCCACTATGCTCACTATGGCTCGCGACTACGCTCGCGCTAAGGGCTTCAAGGGCACATTCCTCATCGAGCCTAAGCCAATGGAGCCTTCTAAGCACCAGTATGACGTTGACACTGAGACTGTCATCGGCTTCCTCAAGGCACACAACCTCGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCAACTCTCGCTGGCCACACCTTCGAGCACGAGCTCGCAGTGGCAGTGGACAACAACATGCTCGGCTCTATCGACGCTAACCGTGGTGACTACCAGAATGGCTGGGATACTGACCAGTTCCCAATCGACCAGTACGAACTCGTTCAGGCTTGGATGGAAATCATCCGTGGCGGCGGTCTCGGCACTGGCGGCACGAACTTCGACGCTAAGACTCGTCGTAACTCTACCGACCTCGAAGACATCTTCATCGCACACATCGCAGGCATGGACGCTATGGCACGCGCACTCGAATCAGCTGCTAAGCTCCTCGAAGAGTCTCCATACAAGGCAATGAAGGCAGCTCGCTACGCTTCATTCGACAACGGTATCGGTAAGGACTTCGAAGATGGCAAGCTCACTCTCGAGCAGGCTTACGAATACGGTAAGAAGGTTGGTGAGCCTAAGCAGACTTCTGGCAAGCAGGAGCTCTACGAAGCCATCGTTGCAATGTACGCTTAA 1753MI2_Neocalli- Amino   54MAKEYFPEIGKIKFEGKDSKNPMAFHYYDPEKVIMGKPMKDWLRFAMAWWHTLCAEGGDQ 006mastigales AcidFGGGTKKFPWNNGADAVEIAKQKADAGFEIMQKLGIPYFCFHDVDLVSEGASVEEYEANLKAITDYLAVKMKETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNNMLGSIDANRGDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHIAGMDAMARALESAAKLLEESPYKAMKAARYASFDNGIGKDFEDGKLTLEQAYEYGKKVGEPKQTSGKQELYEAIVAMYA 5586MI3_ Neocalli- DNA  55ATGGCTAAAGAATTTTTCCCAGAGATTGGTAAAATCAAGTTCGAAGGCAAGGATTCAAAG 005mastigales 
AATCCAATGGCTTTCCATTACTATGATGCAGAGAAGGTAATCATGGGCAAACCCATGAAGGACTGGCTCCGTTTCGCTATGGCATGGTGGCACACACTCTGTGCAGAGGGCGGCGACCAGTTCGGTGGCGGTACGAAGAAGTTCCCTTGGAACGAGGGTGCTAATGCTGTCGAGATTGCTAAGCAGAAGGCTGACGCTGGTTTCGAAATCATGCAGAAGCTTGGCATTCCTTACTTCTGCTTCCACGATGTTGACCTCGTTTCTGAAGGCGCATCTGTTGAGGAGTATGAGGCCAACCTCAAGGCTATCACTGACTATCTCGCGGTGAAGATGAAGGAGACTGGCATTAAGCTCCTGTGGTCTACTGCCAACGTGTTCGGCAATGGCCGTTACATGAATGGTGCTTCCACCAACCCTGACTTCGACGTTGTTGCTCGCGCCATCGTTCAGATTAAGAACGCTATCGATGCAGGTATCAAGCTCGGTGCTGAGAACTATGTGTTCTGGGGCGGTCGTGAAGGTTACATGAGCCTCCTGAACACAGACCAGAAGCGTGAGAAGGAGCACATGGCTACTATGCTCACTATGGCTCGCGACTACGCTCGCAGCAAGGGCTTCAAGGGTACTTTCCTCATCGAGCCTAAGCCAATGGAGCCATCTAAGCACCAGTACGACGTTGACACAGAGACTGTTATCGGCTTCCTGAAGGCACACAACCTTGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCAACACTCGCTGGICACACCTTCGAGCACGAGCTCGCTGTGGCTGTCGACAACAATATGCTTGGTTCTATCGATGCTAACCGCGGTGACTACCAGAATGGTTGGGATACGGACCAGTTCCCAATTGACCAGTACGAGCTCGTTCAGGCTTGGATGGAGATCATCCGTGGTGGCGGTCTCGGCACAGGTGGTACAAACTTCGACGCTAAGACTCGTCGTAACTCTACCGACCTCGAGGACATTTTCATTGCTCACATCGCTGGTATGGACGCTATGGCTCGCGCTCTTGAGTCAGCAGCTAAGCTCCTTGAGGAGTCTCCATACAAGAAGATGAAGGCTGCCCGTTATGCTTCTTTCGACAGCGGCATGGGTAAGGACTTTGAGAACGGCAAGCTCACACTCGAACAGGTTTATGAGTATGGTAAGAAGGTAGGTGAGCCCAAGCAGACTTCTGGCAAGCAGGAGCTCTTCGAGGCAATCGTGGCCATGTACGCATAA 5586MI3_Neocalli- Amino   56MAKEFFPEIGKIKFEGKDSKNPMAFHYYDAEKVIMGKPMKDWLRFAMAWWHTLCAEGGDQ 005mastigales AcidFGGGTKKFPWNEGANAVEIAKQKADAGFEIMQKLGIPYFCFHDVDLVSEGASVEEYEANLKAITDYLAVKMKETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNNMLGSIDANRGDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHIAGMDAMARALESAAKLLEESPYKKMKAARYASFDSGMGKDFENGKLTLEQVYEYGKKVGEPKQTSGKQELFEAIVAMYA 5586MI91_ Neocalli- DNA  57ATGGCTAAAGAGTATTTTCCAGAGATTGGTAAAATCAAGTTTGAAGGCAAGGATTCCAAG 002mastigales 
AATCCAATGGCATTCCACTATTATGATGCAGAGAAAGTGATTATGGGTAAGCCTATGAAGGAGTGGCTCCGCTTTGCAATGGCATGGTGGCACACACTCTGTGCAGAGGGIGGCGACCAGTTTGGTGGTGGCACTAAGAAATTCCCATGGAACGAGGGCACTGACGCTGTGACGATTGCTAAGCAGAAGGCTGATGCAGGTTTCGAAATCATGCAGAAACTCGGTTTCCCATATTTTTGCTTCCACGACATTGACCTCGTTTCCGAAGGCAACAGCATTGAAGAGTATGAGGCTAACCTCCAGGCAATCACTGATTATCTGAAAGTGAAGATGGAAGAGACAGGCATCAAACTCTTGTGGTCAACTGCCAACGTATTCGGCAATGGTCGCTACATGAATGGTGCTTCCACAAACCCAGACTTTGACGTGGTGGCTCGTGCCATCGTTCAGATTAAGAACGCAATTGACGCTGGTATCAAACTCGGTGCTGAGAACTATGTATTCTGGGGCGGTCGCGAAGGCTACATGAGCCTTCTGAACACTGACCAGAAGCGTGAGAAGGAGCACATGGCAACCATGCTCACTATGGCTCGCGACTACGCTCGCAGCAAGGGTTTCAAGGGCACTTTCCTCATTGAGCCAAAGCCAATGGAGCCATCTAAGCACCAGTATGACGTTGACACGGAGACTGTCATCGGCTTCCTCAAGGCACACAACCTCGACAAGGATTTCAAGGTGAACATCGAAGTGAACCACGCTACACTTGCAGGTCATACTTTCGAGCACGAACTTGCTGTGGCTGTTGACAATGGCATGCTCGGTTCTATCGACGCTAACCGTGGTGACTATCAGAACGGTTGGGACACTGACCAGTTCCCAATCGACCAGTACGAACTCGTTCAGGCTTGGATGGAAATCATCCGTGGTGGTGGTCTCGGCACAGGTGGTACTAACTTCGATGCTAAGACTCGTCGTAACTCAACTGACCTCGAGGACATCTTCATCGCACACATCTCTGGTATGGATGCAATGGCACGTGCTCTCGAATCGGCGGCTAAACTTCTTGAGGAGTCTCCATACTGCGCTATGAAGAAGGCTCGTTACGCTTCCTTCGACAGCGGCATCGGTAAGGACTTCGAGGACGGCAAACTCACGCTCGAGCAGGCTTACGAGTACGGCAAGAAAGTCGGCGAACCCAAGCAGACTTCTGGCAAGCAGGAACTCTACGAGGCAATCGTTGCCATGTACGCATAA 5586MI91_Neocalli- Amino   58MAKEYFPEIGKIKFEGKDSKNPMAFHYYDAEKVIMGKPMKEWLRFAMAWWHTLCAEGGDQ 002mastigales AcidFGGGTKKFPWNEGTDAVTIAKQKADAGFEIMQKLGFPYFCFHDIDLVSEGNSIEEYEANLQAITDYLKVKMEETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHISGMDAMARALESAAKLLEESPYCAMKKARYASFDSGIGKDFEDGKLTLEQATEYGKKVGEPKQTSGKQELYEAIVAMYA 5586MI194_ Neocalli- DNA  59ATGGCAAAAGAGTATTTCCCTACGATCGGTAAGATCGTTTATGAAGGACCGGAGTCCAAG 003mastigales 
AACCCTATGGCATTTCATTACTATGACGCAGAGCGCGTAGTAGCTGGTAAAAAAATGAAAGATTGGATGCGTTTCGCTATGGCATGGTGGCACACCCTCTGTGCAGAAGGTGCAGACCAGTTCGGTGGAGGCACCAAACACTTCCCGTGGAGTGAAGGTCCCGATGCCGTAACCATCGCCAAGCAGAAAGCAGACGCAGGTTTTGAGATCATGCAGAAACTCGGCTTCCCGTATTTCTGTTTCCATGACGTGGATCTGGTCAGCGAAGGCAGCAGCGTAGAAGAGTACGAGGCGAACCTCGCAGCCATCACCGATTATCTCAAGCAGAAAATGGACGAGTCGGGTATCAAACTCCTTTGGTCCACTGCTAACGTATTCGGTCACGCCCGTTACATGAACGGTGCCAGCACCAATCCTGACTTTGATGTCGTTGCCCGTGCGATTGTGCAGATCAAGAATGCTATCGACGCAGGTATCAAACTCGGCGCAGAGAACTACGTCTTCTGGGGCGGTCGTGAAGGTTATATGAGCCTGCTCAATACCGACCAGAAACGCGAGAAAGAGCATACGGCAATGATGCTGCGTATGGCGCGTGACTATGCCCGCAGCAAAGGTTTCAAAGGTACCTTCCTCATCGAACCCAAACCCATGGAGCCGTCCAAGCACCAGTATGACGTAGATACCGAGACGGTGATAGGTTTCCTCAAAGCACACGGTTTGGAGAAAGACTTTAAGGTAAACATCGAAGTGAACCACGCTACCCTCGCCGGTCACACTTTCGAGCACGAACTGGCAGTAGCCGTAGATAACGGCATGCTCGGTTCGATCGATGCCAACCGCGGTGACTATCAGAACGGATGGGATACCGACCAGTTCCCCATCGATAACTTCGAACTGACCCAAGCATGGATGCAGATCGTACGTAACGGTGGTCTCGGCACAGGCGGAACGAACTTCGACTCCAAGACCCGTCGTAACTCCACCGATCTCGAGGATATCTTCATCGCTCACATCAGTGGTATGGACGCTTGTGCCCGTGCCCTATTGAATGCCGTAGAGATCATGGAGAAATCACCGATCCCTGCTATGCTCAAAGAGCGTTACGCTTCCTTCGATAGCGGTCTGGGTAAAGATTTCGAGGACGGCAAACTGACCCTTGAGCAAGTCTATGAGTACGGTAAGAAAGTAGGCGAACCCAAACAAACCAGCGGCAAACAAGAACTCTATGAGGCTATCGTTGCCCTCTACGCTAAATAA 5586MI194_Neocalli- Amino   60MAKEYFPTIGKIVYEGPESKNPMAFHYYDAERVVAGKKMKDWMRFAMAWWHTLCAEGADQ 003mastigales AcidFGGGYKHFPWSEGPDAVYIAKQKADAGFEIMQKLGFPYPCFHDVELVSEGSSVEEYEANLAAITDYLKQKMDESGIKLLWSTANVFGHARYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHTAMMLRMARDTARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHGLEKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDSKTRRNSTDLEDIFIAHISGMDACARALLNAVEIMEKSPIPAMLKERYASFDSGLGKEFEDGKLYLEQVYEYGKKVGEPKQTSGKQELYEAIVALYAK 5586MI198_ Neocalli- DNA  61ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC 003mastigales 
CCGATGGCATTTCACTACTATGACGCAGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAGTGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCGGAAGGCGGCGACCAGTTCGGAGGCGGAACGAAGAAGTTCCCCTGGAACGAAGGCGCTAACGCTTTGGAGATCGCCAAGCACAAAGCCGATGCGGGATTTGAGATCATGCAGAAACTCGGCATCCCTTATTTCTGTTTCCATGACGTGGATCTCATCGCCGAGGGCGGTTCGGTAGAAGAGTACGAAGCCAACCTCGCTGCCATCACCGATTACCTCAAACAGAAAATGGACGAGACTGGCATCAAACTGCTGTGGTCCACGGCGAACGTCTTCAGCAACCCCCGTTATATGAACGGCGCCAGCACGAACCCCGATTTCGATGTAGTAGCGCGTGCCATCGTCCAGATCAAGAACGCTATCGACGCCGGTATCAAACTCGGAGCAGAGAACTATGTCTTCTGGGGTGGTCGCGAGGGCTATATGAGCCTCCTCAACACTGACCAGCGCCGAGAGAAAGAGCATATGGCTACCATGCTCCGTATGGCGCGTGACTACGCGCGTGCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCATGTGAGCCGTCCAAACATCAGTATGATGTCGATACCGAGACCGTCATCGGTTTCCTCAAAGCGCATGGACTCGACAAGGATTTCAAAGTCAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAACACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGTGACGCCCAGAACGGATGGGACACCGACCAGTTCCCTATTGATAACTTCGAACTCACACAGGCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGYACGAACTTCGACGCCAAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATGGACGCTTGCGCACGTGCGTTACTCAATGCTGTCGAAATCCTCGAGAAGAGCCCGATTCCGGCGATGCTCAAAGAGCGTTATGCTTCCTTTGACGGCGGCATCGGAAAGGACTTCGAGGAGGGAAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAAGTCGGCGAACCCAAACAGACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG 5586MI198_Neocalli- Amino   62MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF 003mastigales AcidGGGTKKFPWNEGANALETAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYEANLAAITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSKHQYDVDTETVIGFLKAHGLDKDPKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDYDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACARALLNAVEILEKSPIPAMLKERYASPDGGIGKDFEEGKLTFEQVYEYGKKVGEPKQTSGKQELYETIVALYAK 5586MI201_ Neocalli- DNA  63ATGGCAAAAGAGTATTTCCCTACGATCGGTAAGATCGTTTATGAAGGACCGGAATCCAAG 003mastigales 
AACCCTATGGCATTTCATTACTATGACGCAGAGCGCGTAGTAGCTGGTAAAAAAATGAAAGATTGGATGCGTTTCGCTATGGCATGGTGGCACACCCYCTGTGCAGAAGGTGCAGACCAGTTCGGTGGAGGCACCAAACACTTCCCGTGGAATGAAGGTCCCGATGCCGTAACCATCGCCAAGCAGAAAGCAGACGCAGGTTTTGAGATCATGCAGAAACTCGGCTTCCCGTATTTCTGTTTCCATGACGTGGATCTGGTCGGCGAAGGCAGCAGCGTAGAAGAGTACGAGGCGAACCTCGCAGCCATCACCGATTATCTCAAGCAGAAAATGGACGAGTCGGGTATCAAACTCCTTTGGTCCACTGCTAACGTATTCGGTCACGCCCGTTACATGAACGGTGCCAGCACCAATCCTGACTTTGATGTCGTTGCCCGTGCGATTGTGCAGATCAAGAATGCTATCGACGCAGGTATCAAACTCGGCGCAGAGAACTACGTCTTCTGGGGCGGTCGTGAAGGTTATATGAGCCTGCTCAACACCGACCAGAAACGCGAGAAAGAGCATACGGCAATGATGCTGCGTATGGCGCGTGACTATGCCCGCAGCAAAGGTTTCAAAGGTACCTTCCTCATCGAACCCAAACCCATGGAGCCGTCCAAGCACCAGTATGACGTAGATACCGAGACGGTGATAGGTTTCCTCAAAGCACACGGTTTGGAGAAAGACTTTAAGGTAAACATCGAAGTGAACCACGCTACCCTCGCCGGICACACTTTCGAGCACGAACTGGCAGTAGCCGTAGATAACGGCATGCTCGGTTCGATCGATGCCAACCGCGGTGACTATCAGAACGGATGGGATACCGACCAGTTCCCCATCGATAACTTCGAACTGACCCAAGCATGGATGCAGATCGTACGTAACGGTGGTCTCGGCACAGGCGGAACGAACTTCGACTCCAAGACCCGTCGTAACTCCACCGATCTCGAGGATATCTTCATCGCTCACATCAGTGGTATGGACGCTTGTGCCCGTGCCCTATTGAATGCCGTAGAGATCATGGAGAAATCACCGATCCCTGCTATGCTCAAAGAGCGTTACGCTTCCTTCGATAGCGGTCTGGGTAAAGATTTCGAGGACGGCAAACTGACCCTTGAGCAAGTCTATGAGTACGGTAAGAAAGTAGGCGAACCCAAACAAACCAGCGGCAAACAAGAACTCTATGAGGCTATCGTTGCCCTCTACGCTAAATAA 5586MI201_Neocalli- Amino   64MAKEYFPTIGKIVYEGPESKNPMAFHYYDAERVVAGKKMKDWMRFAMAWWHTLCAEGADQ 003mastigales AcidFGGGTKHFPWNEGPDAVTIAKQKADAGFEIMQKLGFPYFCFHDVDINGEGSSVEEYEANLAAITDYLKQKMDESGIKLLWSTANVFGHARYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHTAMMLRMARDYARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHGLEKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDSKTRRNSTDLEDIFIAHISGMDACARALLNAVEIMEKSPIPAMLKERYASFDSGLGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQELYEAIVALYAK 5586MI204_ Neocalli- DNA  65ATGAAAGAGTATTTCCCTGAGGTCGGTAAGATCCAATTTGAAGGCCCGGAGTCTAAGAAC 002mastigales 
CCGATGGCATTTCACTACTATGACGCAGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAGTGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCAGAAGGCGGCGACCAGTTCGGAGGCGGAACGAAGCATTTCCCGTGGAATGAAGGCGCTAACGCTTTGGAGATCGCCAAACACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTCCATGACGTGGATCTCATCGCCGAGGGCGGTTCGGTAGAAGAGTACGAAACCAACCTCGCTGCTATCACCGACTACCTCAAGCAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCCACGGCGAACGTGTTCAGCAACCCCCGTTATATGAACGGCGCGAGCACGAACCCCGATTTCGATGTAGTAGCGCGTGCCATCGTGCAGATCAAGAATGCCATCGACGCCGGCATCAAACTGGGCGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTACATGAGCCTGCTCAACACCGACCAGCGCCGCGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCGCGTGCCAAAGGATTCAAGGGCACCTTTCTCATCGAACCCAAACCGTGTGAGCCGTCCAAACATCAGTATGATGTCGATACCGAGACCGTCATCGGTTTCCTCAAAGCGCATGGACTCGACAAGGATTTCAAGGTTAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAACACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGTGACGCCCAGAACGGATGGGACACCGACCAGTTCCCTATTGATAACTTCGAACTCACACAGGCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCCAAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATGGACGCTTGCGCACGTGCGTTGCTCAACGCCATCGAAATCCTCGAGAAGAGCCCGATCCCGGCTATGCTCAAAGACCGTTATGCCTCCTTTGATGGCGGCATCGGAAAGGACTTTGAGGAGGGCAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAGACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG 5586MI204_Neocalli- Amino   66MKEYFPEVGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF 002mastigales AcidGGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYETNLAAITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACARALLNAIEILEKSPIPAMLKDRYASFDGGIGKPFEEGKLTFEQVYEYGKKVGEPKQTSGKQELYETIVALYAK 5586MI207_ Neocalli- DNA  67ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC 002mastigales 
CCGATGGCGTTTCACTACTATGACGCTGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAGTGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCGGAAGGCGGCGACCAGTTCGGAGGAGGAACGAAGAAATTCCCCTGGAACGAAGGGGCAAACGCTTTGGAGATCGCCAAGCACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCTTATTTCTGTTTCCATGACGTGGATCTCATCGCCGAGGGCGAATCGGTAGAAGAGTACGAAGCCAACCTCGCTGCCATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCCACGGCGAACGTGTTCAGCAACCCCCGTTATATGAACGGCGCCAGCACGAACCCCGATTTCGATGTAGTGGCACGCGCTATCGTACAAATCAAGAACGCTATCGACGCCGGTATCAAACTCGGAGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTATATGTCGCTCCTCAACACCGACCAGCGCCGAGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCGCGTTCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCGTGTGAGCCGTCCAAACATCAGTACGATGTGGACACAGAGACCGTCATCGGTTTCCTTAAAGCGCATGGACTCGACAAGGATTTCAAAGTCAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAACACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGTGACGCCCAGAACGGATGGGACACCGACCAATTCCCTATTGATAACTTCGAACTCACTCAGGCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCCAAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATGGACGCTTGCGCTCGTGCGTTGCTCAATGCTGTCGAAATCCTCGAGAAGAGCCCGATCCCGGCTATGCTCAAAGAGCGTTATGCTTCCTTTGACGGCGGCATCGGAAAGGACTTTGAGGAGGGCAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAGACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATGA 5586MI207_Neocalli- Amino   68MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF 002mastigales AcidGGGTKKFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGESVEEYEANLAAITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARSKGFKGTFLIEPKPCEPSKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACARALLNAVEILEKSPIPAMLKERYASFDGGIGKDFEEGKLTFEQVYEYGKKVGEPKQTSGKQELYETIVALYAK 5586MI209_ Neocalli- DNA  69ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC 003mastigales 
CCGATGGCGTTTCACTACTATGACGCAGAGCGCGTAGTAGCCGGTAAAACAATGAAAGAATGGATGCGTTTCGCCATGGCATGGTGGCACACCCTCTGTGCAGAAGGCGGCGACCAGTTCGGAGGAGGAACGAAGCATTTCCCGTGGAATGAAGGCGCTAACGCTTTGGAGATCGCCAAACACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTCCATGACGTGGATCTCATCGCCGAGGGCGATTCGGTGGAGGAGTACGAAGCTAACCCCGCTGCCATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCCACGGCGAACGTCTTCAGCAACCCCCGTTACATGAACGGTGCGAGCACGAACCCGGATTTCGATGTAGTGGCACGCGCTATCGTACAAATCAAGAACGCTATCGACGCCGGTATCAAACTCGGAGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTATATGTCGCTCCTCAACACCGACCAGCGTCGCGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCGCGTGCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCATGTGAGCCGTCCAAACATCAGTACGATGTGGACACAGAGACTGTCATCGGTTTCCTCAAAGCGCATGGACTCGACAAGGATTTCAAAGTCAACATCGAGGTCAACCACGCCACCCTCGCAGGTCACACGTTCGAACACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGTGACGCCCAGAACGGATGGGACACTGACCAGTTCCCTATTGATAACTTCGAACTCACACAGGCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCCAAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATGGACGCTTGTGTCCGTGCGTTGCTCAACGCCATCGAAATCCTCGAGAAGAGCCCGATCCCGGCTATGCTCAAAGAGCGTTACGCTTCCTTTGACGGCGGCATCGGAAAGGACTTTGAGGATGGTAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAGACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAGTAA 5586MI209_Neocalli- Amino   70MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF 003mastigales AcidGGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGDSVEEYEANPAAITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACVRALLNAIEILEKSPIPAMLKERYASFDGGIGKDFEDGKLTFEQVTEYGKKVGEPKQTSGKQELYETIVALYAK 5586MI214_ Neocalli- DNA  71ATGAAAGAGTATTTCCCTGAGATCGGAAAGATCCAATTCGAAGGCCCGGAGTCCAAGAAT 002mastigales 
CCTATGGCATTTCACTACTATGACGCAGAGCGTGTAGTAGCCGGTAAAACAATGAAAGAGTGGATGCGTTTCGCTTTGGCATGGTGGCACACGCTCTGCGCAGAAGGCGGCGACCAGTTCGGAGGCGGCACGAAGCATTTCCCTTGGAATGAAGGTGCAAACGCTTTGGAGATCGCCAAGCACAAAGCCGATGCAGGCTTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTCCATGACGTGGATCTGATCGCCGAGGGCGGTTCGGTAGAAGAGTATGAAGCTAATTTAACGGCTATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCCACTGCGAACGTGTTCGGTAACGCACGTTATATGAACGGCGCGAGCACGAACCCCGATTTCGATGTAGTGGCACGCGCTATCGTGCAGATCAAGAACGCTATCGACGCCGGCATCAAACTGGGCGCAGAGAACTACGTCTTCTGGGGCGGTCGCGAGGGATATATGTCGCTCCTGAACACCGACCAGAAGCGTGAGAAAGAGCATATGGCTACCATGCTCCGTATGGCGCGTGACTACGCGCGTTCCAAAGGATTCAAAGGTACGTTCCTCATCGAGCCCAAACCGTGTGAGCCGTCCAAACATCAGTACGACGTGGACACTGAGACCGTCATCGGTTTCCTCAAAGCCCATGGTCTCGGCAAGGATTTCAAAGTGAACATCGAGGTGAATCACGCCACCCTCGCAGGGCACACGTTCGAACACGAACTGGCTTGCGCCGTAGATGCCGGCATGCTCGGTTCGATCGACGCCAACCGCGGTGACGCACAAAACGGATGGGACACCGACCAGTTCCCTATTGATAATTTCGAACTCACCCAGGCATTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCCAAGACACGCCGTAATTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATGGACGCTTGTGCCCGTGCGTTGCTCAATGCTGTCGAAATCCTTGAAAAGAGCCCGATCCCGGCGATGCTCAAAGAGCGTTACGCCTCCTTTGACAGCGGTATGGGTAAGGACTTTGAGGAGGGCAAGCTGACCTTCGAGCAGGTCTATGAGTACGGCAAACAGGTCGGCGAACCCAAACAGACCAGCGGCAAGCAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG 5586MI214_Neocalli- Amino   72MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFALAWWHTLCAEGGDQF 002mastigales AcidGGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYEANLTAITDYLKQKMDETGIKLLWSTANVFGNARYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHMATMLRMARDYARSKGFKGTFLTEPKPCEPSKHQYDVDTETVIGFLKAHGLGKDPKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACARALLNAVEILEKSPIPAMLKERYASFDSGMGKDFEEGKLTFEQVYEYGKQVGEPKQTSGKQELYETIVALYAK 5751MI3_ Neocalli- DNA  73ATGAAAGAGTATTTTCCACAAATCGGCAAGATCCCATTTGAGGGACCAGAGTCAAAGAAC 001mastigales 
CCAATGGCATTCCACTACTATGACGCAGAGCGCGTAGTTGCCGGTAAGACAATGAAGGAATGGATGCGTTTCGCTATGGCCTGGTGGCACACTCTCTGTGCTGAGGGTAGCGATCAGTTCGGCCCTGGTACAAAGAAGTTCCCTTGGAACGAGGGCGAGACAGCCCTTGAGCGCGCTAAGCACAAGGCAGATGCTGGCTTCGAGGTTATGCAGAAGCTCGGCATCCCATATTTCTGCTTCCACGATGTAGACCTTATCGACGAGGGTGCTAACGTGGCTGAGTATGAGGCAAACCTCGCTGCTATCACTGACTACCTGAAGGAGAAGATGGAGGAGACTGGCGTAAAGCTCCTCTGGTCTACAGCCAACGTGTTCGGTAACGCTCGCTATATGAACGGTGCTTCTACAAATCCTGACTTCGACGTTGTGGCTCGTGCCATCGTACAGATTAAGAACGCTATCGACGCTGGTATCAAGCTTGGTGCTGAGAACTACGTGTTCTGGGGCGGCCGCGAGGGCTACATGAGCCTTCTGAACACTGACCAGAAGCGCGAGAAGGAGCACATGGCAACTATGCTCGGCATGGCTCGCGACTATGCCCGCGCTAAGGGATTCACCGGTACCTTCCTCATTGAGCCAAAGCCAATGGAGCCAACAAAGCATCAGTATGATGTTGACACAGAGACCGTTATCGGTTTCCTCAAGGCTCACGGTCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACTCTCGCCGGTCACACCTTCGAGCACGAGCTCGCTTGCGCTGTTGACGCTGGTATGCTCGGTTCTATCGACGCTAACCGCGGTGACGCTCAGAACGGATGGGATACCGACCAGTTCCCAATCGACAACTTCGAGCTGACACAGGCTTGGATGCAGATTGTTCGCAATGGCGGTCTTGGCACAGGTGGTACCAACTTCGACGCAAAGACCCGTCGTAACTCTACCGACCTCGAGGACATCTTCATCGCTCACATCTCCGGTATGGACGCTTGTGCACGCGCTCTCCTCAACGCAGTAGAGATACTCGAGAACTCTCCAATCCCAACAATGCTGAAGGACCGCTATGCAAGCTTCGACTCAGGTATGGGTAAGGACTTCGAGGACGGCAAGCTCACACTTGAGCAGGTTTATGAGTATGGTAAGAAGGTCGACGAGCCAAAGCAGACCTCTGGTAAGCAGGAACTCTATGAGACCATCGTTGCTCTCTATGCAAAATAA 5751MI3_Neocalli- Amino   74MKEYFPQIGKIPFEGPESKNPMAFHYYDAERVVAGKTMKEWMPFAMAWWHTLCAEGSDQF 001mastigales AcidGPGTKKFPWNEGETALERAKHKADAGFEVMQKLGIPYFCFHDVDLIDEGANVAEYEANLAAITDYLKEKMEETGVKLLWSTANVFGNARYMNGASTNPDFDVVARAIVQIKNAIDAGIKLGAENYVFWGGREGYMSLLNTDQKREKEHMATMLGMARDYARAKGFTGTFLIEPKPMEPTKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDAKTRRNSTDLEDIFIAHISGMDACARALLNAVEILENSPIPTMLKDRYASFDSGMGKDFEDGKLTLEQVYEYGKKVDEPKQTSGKQELYETIVALYAK 5753MI3_ Prevotella DNA  75ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGGCGACAGCAAA 
002AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAGGAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCCTTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCCAAAGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGTATCGAATACTTCTGCTTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATGAAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAGTTCGATGTGGIGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATTACACCCTCCTCAACACCCAGATGCAGCGGGAAAAAGACCACCTGGCCAAGTTGCTGACGGCCGCCCGCGACTATGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAACCCACCAAGCACCAGTACGACGTGGATACGGAGACGGTCATCGGCTTCCTCCGTGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTICGAGCATGAGCTCACCGTGGCCCGCGAGAACGGTTTCCTGGGCTCCATCGGTGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACGGACCAGTTCCCTGTGGACCCGTACGATCTTACCCAGGCCATGATGCAGGTGCTGCTGAACGGCGGCTTCGGCAACGGCGGCACCAACTTCGACGCCAAACTCCGCCGCTCCTCCACCGACCCTGAGGACATCTTCATCGCCCATATTTCCGCCATGGATGCCATGGCCCACGCTTTGCTTAACGCAGCTGCCGTGCTGGAAGAGAGCCCCCTGTGCCAGATGGTCAAGGAGCGTTATGCCAGCTTCGACGGCGGCCTCGGCAAACAGTTCGAGGAAGGCAAGGCTACCCTGGAAGACCTGTACGAATACGCCAAGGTCCAGGGTGAACCCGTTGTCGCCTCCGGCAAGCAGGAGCTTTACGAGACTCTCCTGAACCTGTATGCCGTCAAGTAA 5753MI3_Prevotella Amino   76MAKEYFPSIGKIPFEGGDSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP 002 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIGANRGDAQNGWDTDQFPVDPYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCQMVKERYASFDGGLGKQFEEGKATLEDLYEYAKVQGEPVVASGKQELYETLLNLYAVK 1754MI1_ Prevotella DNA  77ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAAGGAAAGGACAGTAAG 
001AATGTAATGGCTTTCCACTACTACGAGCCTGAGAAGGTCGTGATGGGAAAGAAGATGAAGGACTGGCTGAAGTTCGCTATGGCTTGGTGGCATACACTGGGTGGCGCTTCTGCTGACCAGTTTGGTGGTCAGACTCGTTCATACGAGTGGGACAAGGCTGGTGACGCTGTTCAGCGCGCTAAGGATAAGATGGACGCTGGCTTCGAGATCATGGACAAGCTGGGCATCGAGTACTTCTGCTTCCACGATGTTGACCTCGTTGAAGAGGGTGACACCATCGAGGAGTATGAGGCTCGCATGAAGGCCATCACCGACTACGCTCAGGAGAAGATGAAGCAGTTCCCCAACATCAAGCTGCTCTGGGGTACCGCAAACGTATTCGGTAACAAGCGCTATGCTAACGGTGCTTCTACCAACCCCGACTTCGACGTAGTGGCTCGCGCCATCGTTCAGATCAAGAACGCTATTGATGCTACCATCAAGCTGGGTGGTACCAACTATGTGTTCTGGGGTGGTCGTGAGGGCTATATGAGTCTGCTGAACACCGACCAGAAGCGTGAGAAGGAGCACATGGCTACTATGCTGACCATGGCTCGCGACTATGCTCGCGCCAAGGGATTCAAGGGTACATTCCTCATTGAGCCGAAGCCCATGGAGCCCAGCAAGCACCAGTATGATGTGGATACAGAGACCGTTATCGGCTTCCTGAAGGCACACAACCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACACTCGCTGGTCATACCTTCGAGCACGAGCTGGCTTGCGCTGTTGACGCTGGTATGCTTGGTTCTATCGACGCTAACCGTGGTGATGCTCAGAACGGTTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTGACACAGGCTATGCTCGAGATCATCCGCAATGGIGGTCTGGGCAATGGTGGTACCAACTTCGATGCTAAGATCCGTCGTAACAGCACCGACCTCGAGGATCTCTTCATCGCTCACATCAGTGGTATGGATGCTATGGCACGCGCTCTGATGAACGCTGCTGACATCCTTGAGAACTCTGAGCTGCCCGCAATGAAGAAGGCTCGCTACGCAAGCTTCGACCAGGGTGTTGGTAAGGACTTCGAAGATGGCAAGCTGACCCIIGAGCAGGTITACGAGTATGGTAAGAAGGTGGGTGAGCCCAAGCAGACTTCTGGTAAGCAGGAGAAGTACGAGACCATCGTTGCTCTCTATGCAAAATAA 1754MI1_Prevotella Amino   78MAKEYFPFTGKIPFEGKDSKNVMAFHYTEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ 001 AcidFGGQTRSYEWDKAGDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGDTIEEYEARMKAITDYAQEKMKQFPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHNLDKPFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALMNAADILENSELPAMKKARYASFDQGVGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQEKYETIVALYAK 1754MI3_ Prevotella DNA  79ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAAGGAAAAGAGAGCAAG 
007AACGTAATGGCTTTCCATTACTATGAGCCTGAAAAGGTGGTCATGGGCAAGAAAATGAAGGATTGGCTGAAATTCGCCATGGCTTGGTGGCACACCCTCGGTGGAGCCAGCGCCGACCAGTTCGGTGGACAGACCCGCAGCTATGAGTGGGACAAGGCCGAGGATGCCGTACAGCGTGCTAAGGACAAGATGGACGCCGGCTTCGAGATCATGGACAAACTGGGCATCGAGTATTTCTGCTTCCACGATGTCGACCTCGTCGACGAGGGTGCTACCGTTGAGGAGTATGAGGCTCGCATGAAAGCCATCACCGACTATGCCCAGGTCAAGATGAAGGAATATCCCAACATCAAACTGCTCTGGGGCACCGCCAACGTGTTCGGCAACAAGCGTTATGCCAACGGCGCTTCCACCAACCCCGACTTCGACGTGGTGGCACGCGCTATCGTTCAGATCAAGAATGCCATCGACGCTACCATCAAGCTCGGCGGTCAGAACTACGTGTTCTGGGGCGGACGCGAGGGCTACATGAGCCTGCTCAATACCGATCAGAAACGTGAGAAGGAACACATGGCCACCATGCTCACCATGGCGCGCGACTATGCTCGCAGCAAGGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCCATGGAGCCTTCCAAGCACCAGTATGATGTCGACACCGAGACGGTCATCGGCTTCCTCCGCGCCCACAACCTCGACAAGGACTTCAAGGTGAACATCGAGGTCAACCACGCCACGCTCGCCGGCCACACCTTCGAGCACGAACTGGCTTGCGCCGTCGACGCCGGCATGCTCGGCAGCATCGACGCCAACCGCGGCGACGCACAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAACTGACACAGGCCATGCTGGAGATCATCCGCAATGGCGGCCTCGGCAATGGTGGTACCAACTTCGACGCCAAGATCCGTCGTAACAGCACCGACCTCGAAGATCTCTTCATCGCTCACATCAGCGGTATGGATGCCATGGCTCGCGCGCTGCTCAACGCCGCCGCCATCCTCGAGGAGAGCGAACTGCCCGCCATGAAGAAGGCCCGCTACGCTTCCTTCGACGAAGGTATCGGCAAGGACTTCGAAGACGGCAAACTCACCCTCGAGCAGGTTTACGAGTACGGCAAGAAGGTAGGCGAGCCCAAGCAGACCTCCGGCAAGCAAGAGAAGTACGAGACCATCGTGGCTCTCTACAGCAAATAA 1754MI3_Prevotella Amino   80MAKEYFPFTGKIPFEGKESKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ 007 AcidFGGQTRSYEWDKAEDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVDEGATVEEYEARMKAITDYAQVKMKEYPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLRAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAAAILEESELPAMKKARYASFDEGIGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQEKYETIVALYSK 1754MI5_ Prevotella DNA  81ATGAAAGAGTATTTCCCGCAAATTGGAAAGATTCCCTTCGAGGGACCAGAGAGCAAGAGT 
009CCATTGGCGTTCCATTATTATGAGCCGGATCGCATGGTGCTCGGAAAGAGGATGGAGGATTGGCTGAAATTCGCCATGGCATGGTGGCACACCCTTGGCCAGGCCAGCGGCGACCAGTTCGGCGGACAGACACGTGAGTACGAGTGGGATAAGGCTGGAGATCCGATACAAAGGGCAAAGGATAAGATGGACGCCGGATTCGAGATCATGGAGAAATTGGGTATCAAGTACTTCTGCTTCCATGATGTGGATCTCGTCGAGGAAGCTCCCACCATCGCCGAATATGAGGAGCGTATGAGGATCATCACCGACTATGCGCTCGAGAAGATGAAAGCCACTGGCATCAAACTCCTTTGGGGTACAGCCAATGTTTTCGGACATAAGAGATATATGAATGGGGCCGCCACCAACCCGGAGTTCGGTGTTGTCGCCAGGGCTGCTGTCCAGATCAAGAACGCGATCGACGCCACCATCAAGCTGGGAGGAACAAACTATGTGTTCTGGGGTGGCCGCGAGGGCTACATGAGCCTGCTCAACACCCAGATGCAGAGGGAGAAGGACCATCTCGCCAATATGCTCAAGGCTGCTCGTGACTATGCTCGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCGAAGCCGATGGAACCTACTAAGCATCAGTACGATGTCGACACTGAGACCGTGATCGGCTTCCTCCGCGCAAACGGTCTTGACAAGGATTTCAAGGTCAACATCGAGGTCAATCACGCCACTCTTGCGGGTCACACTTTCGAGCATGAGCTCGCCGTGGCTGTCGACAATGGTCTCCTTGGCTCAATCGATGCGAACAGGGGAGATTATCAGAACGGTTGGGACACCGACCAGTTCCCTGTTGATCTCTTTGATTTGACCCAGGCCATGCTCCAGATCATCCGTAACGGAGGCCTCGGTAATGGIGGATCCAACTTCGACGCCAAGCTTCGCCGTAACTCCACTGATCCTGAGGATATATTCATTGCCCATATTTGCGGTATGGACGCTATGGCCAGGGCTCTCCTTGCCGCCGCCGCGATCGTGGAGGAGTCTCCTATCCCGGCTATGGTCAAAGAGCGTTACGCATCCTTCGACGAAGGTGAGGGCAAGAGATTCGAGGATGGTAAGATGAGTCTGGAGGAACTTGTTGATTACGCGAAGACTCACGGAGAGCCCGCCCAGAAGAGTGGCAAACAGGAGCTCTACGAAACCCTTGTCAACATGTACATCAAATAA 1754MI5_Prevotella Amino   82MKEYFPQIGKIPFEGPESKSPLAFHYYEPDRMVDGKRMEDWLKFAMAWWHTLGQASGDQF 009 AcidGGQTREYEWDKAGDPIQRAKDKMDAGFEIMEKLGIKYFCFHDVDLVEEAPTIAEYEERMRIITDYALEKMKATGIKLLWGTANVFGHKRYMNGAATNPEFGVVARAAVQIKNAIDATIKLGGTNYVFWGGREGYMSLLNTQMQREKDHLANMLKAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGLLGSIDANRGDYQNGWDTDQFPVDDFDLTQAMLQIIRNGGLGNGGSNFDAKLRRNSTDPEDIFIAHICGMDAMARADDAAAAIVEESPIPAMVKERYASFDEGEGKRFEDGKMSDEELVDTAKTHGEPAQKSGKQELYETLVNMYIK 5586MI1_ Prevotella DNA  83ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAGGGAAAGGACAGTAAG 
003AATGTAATGGCGTTCCACTACTACGAGCCCGAGCGCGTGGTAATGGGCAAGAAGATGAAGGAGTGGCTGAAGTTTGCCATGGCCTGGTGGCACACGCTGGGTGGAGCCAGTGCCGACCAGTTTGGCGGACAGACCCGCAGCTACGAGTGGGACAAGGCTGAAGACGCCGTGCAGCGTGCCAAGGACAAGATGGATGCCGGCTTCGAGATCATGGACAAGCTGGGCATCGAGTATTTCTGCTTCCATGATGTCGATCTCGTTGACGAGGGTGCCACTGTCGAGGAGTATGAGGCTCGCATGCAGGCCATCACCGACTATGCGCAGGAGAAGATGAAGCAGTATCCTGCCATCAAGCTGCTGTGGGGTACGGCCAATGTCTTTGGCAACAAGCGTTATGCCAACGGTGCCTCTACCAATCCCGACTTCGATGTGGTGGCCCGCGCCATCGTGCAGATTAAGAATGCCATTGATGCCACCATCAAGCTGGGCGGCAGCAACTATGTGTTCTGGGGCGGTCGCGAGGGCTACATGTCGCTGCTCAACACCGACCAGAAGCGTGAGAAGGAACACATGGCCCGGATGCTGACCATGGCCCGCGACTATGCCCGCTCGAAGGGCTTCAAGGGCAACTTCCTGATTGAGCCCAAGCCCATGGAGCCGTCGAAGCATCAGTACGACGTGGACACCGAGACGGTTATCGGATTCCTCCGCGCACATGGCCTTGACAAGGACTTCAAGGTGAACATCGAGGTGAACCATGCCACGCTGGCCGGTCATACCTTCGAGCACGAACTGGCTTGCGCCGTAGATGCCGGCATGCTGGGCAGCATTGATGCCAACCGCGGCGACGCACAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTATGAGTTGACACAGGCCATGATGGAGATTATCCGCAATGGCGGTCTGGGTCTTGGCGGTACCAATTTCGATGCCAAGATTCGCCGTAACTCCACCGACCTGGAAGACCTCTTCATCGCCCACATCAGTGGCATGGACGCCATGGCTCGTGCGCTCCTTAATGCTGCCGACATTCTGGAGAACAGCGAACTGCCCGCCATGAAGAAAGCGCGCTACGCCTCGTTCGACAGTGGCATGGGCAAGGACTTCGAGGACGGCAAACTGACCCTTGAGCAGGTTTACGAATACGGCAAAAAAGTCGGCGAACCTAAGCAGACCTCCGGCAAGCAGGAGAAGTACGAGACCATCGTGGCTCTCTATGCCAAGTAA 5586MI1_Prevotella Amino   84MAKEYFPFTGKIPFEGKDSKNVMAFHYTEPERVVMGKKMKEWLKFAMAWWHTLGGASADQ 003 AcidFGGQTRSYEWDKAEDAVQRAKDKMDAGFEIMDKDGIEYFCFHDVDLVDEGATVEEYEARMQAITDYAQEKMKQYPAIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGSNYVFWGGREGYMSLLNTDQKREKEHMARMLTMARDYARSKGFKGNFLIEPKPMEPSKHQYDVDTETVIGFDRAHGLDKDEKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGLGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAADILENSELPAMKKARYASFDSGMGKDFEDGKLTLEQVYEYGKKVGEPKQTSGKQEKTETIVADYAK 5586MI2_ Prevotella DNA  85ATGGCAAAAGAGTATTTTCCGTTTACAGGTAAAATTCCTTTCGAAGGAAAGGACAGTAAG 
006AACGTAATGGCTTTCCACTACTACGAGCCCGAAAAGGTCGTGATGGGAAAGAAAATGAAAGACTGGCTGAAGTTCGCCATGGCCTGGTGGCACACACTGGGTGGCGCCAGCGCCGACCAGTTTGGCGGCCAGACACGCAGCTATGAGTGGGACAAGGCTGCCGATGCCGTGCAGCGCGCAAAGGACAAGATGGACGCCGGCTTCGAAATCATGGACAAGCTGGGCATCGAGTATTTCTGCTTCCACGACGTGGACCTCGTTGAGGAGGGAGCCACCATCGAGGAGTATGAGGCCCGCATGAAGGCTATCACCGACTATGCCCAGGAGAAGATGAAACAGTATCCCAGCATCAAGCTGCTCTGGGGCACCGCCAATGTGTTTGGCAACAAGCGCTACGCCAACGGCGCCAGCACCAACCCCGACTTCGACGTCGTGGCCCGTGCCATCGTGCAGATCAAGAACGCCATCGATGCCACCATCAAGCTGGGCGGCACCAACTACGTGTTCTGGGGCGGACGCGAGGGCTACATGAGCCTGCTCAACACCGACCAGAAGCGCGAGAAGGAGCACATGGCCACCATGCTCACCATGGCCCGCGACTACGCCCGCGCAAAGGGATTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCGTCGAAGCACCAGTACGACGTGGACACCGAGACCGTCATCGGTTTCCTGAAGGCCCACGGTCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTGGCCGGCCACACCTTCGAGCATGAGCTGGCCTGCGCCGTCGACGCCGGTATGCTGGGCAGCATCGATGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTCACCCAGGCCATGATGGAAATTATCCGCAACGGCGGCCTCGGCAACGGCGGCACCAACTTCGACGCTAAGATCCGCCGCAACTCCACCGACCTCGAGGACCTCTTCATCGCCCACATCAGCGGCATGGACGCCATGGCCCGCGCACTGATGAACGCTGCCGACATTATGGAGAACAGCGAGCTGCCCGCCATGAAGAAGGCACGCTACGCCAGCTTCGACGCCGGCATCGGCAAGGACTTTGAGGATGGCAAGCTCTCGCTGGAGCAGGTCTACGAGTATGGCAAGAAGGTGGAAGAGCCCAAGCAGACCAGCGGCAAGCAGGAGAAGTACGAGACCATCGTCGCCCTCTATGCCAAGTAA 5586MI2_Prevotella Amino   86MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ 006 AcidFGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGATIEEYEARMKAITDYAQEKMKQYPSIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALMNAADIMENSELPAMKKARYASFDAGIGKDFEDGKLSLEQVYEYGKKVEEPKQTSGKQEKYETIVALYAK 5586MI8_ Prevotella DNA  87ATGGCAAAAGAGTATTTCGCCTTTACAGGCAAGATTCCTTTCGAGGGAAAAGACAGTAAG 
003AACGTGATGGCTTTCCACTACTACGAGCCGGAGCGTGTGGTGATGGGCAAGAAGATGAAGGAGTGGCTGAAGTTCGCCATGGCCTGGTGGCACACACTGGGTGGCGCATCGGCCGACCAGTTCGGAGGCCAGACACGCAGCTACGAGTGGGACAAGGCCGCCGACGCCGTGCAGCGCGCCAAGGACAAGATGGACGCCGGCTTCGAGATTATGGACAAGCTGGGCATCGAGTACTTCTGCTTCCACGATGTAGACCTCGTTGAGGAGGGTGAGACCATAGCCGAGTACGAGCGCCGCATGAAGGAAATCACCGACTACGCACAGGAGAAGATGAAGCAGTTCCCCAACATCAAGCTGCTCTGGGGCACAGCCAACGTGTTCGGCAACAAGCGCTACGCCAACGGCGCATCGACCAACCCCGACTTCGACGTTGTGGCACGCGCCATCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAGCTCGGCGGCTCCAACTATGTGTTCTGGGGCGGACGCGAGGGCTATATGAGCCTGCTCAACACCGACCAGAAGCGCGAGAAGGAGCACATGGCCACCATGCTCACCATGGCCCGCGACTATGCACGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCGAAGCCCATGGAGCCCTCGAAGCACCAGTACGACGTAGACACAGAGACCGTCATCGGCTTCCTCCGTGCACACGGGCTGGACAAGGACTTCAAGGTGAACATCGAGGTAAACCACGCCACACTGGCCGGCCACACCTTCGAGCACGAGCTGGCTTGCGCCGTCGACGCTGGCATGCTGGGCAGCATCGACGCCAACCGTGGCGACGCACAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTCACACAGGCCATGATGGAAATCATCCGCAATGGCGGACTGGGCAATGGCGGCACCAACTTCGACGCCAAGATCCGTCGTAACAGCACCGACCTCGAAGACCTCTTCATCGCCCACATCAGCGGCATGGACGCCATGGCACGCGCACTGCTCAACGCTGCCGACATCCTGGAGCACAGCGAGCTGCCCAAGATGAAGAAGGAGCGCTACGCCAGCTTCGACGCAGGCATCGGCAAGGACTTCGAAGACGGCAAGCTCACACTCGAGCAGGTCTACGAGTACGGCAAGAAGGTCGAAGAGCCCCGTCAGACCAGCGGCAAGCAGGAGAAGTACGAGACCATCGTCGCCCTCTATGCCAAGTAA 5586MI8_Prevotella Amino   88MAKEYFAFTGKIPFEGKDSKNVMAFHYTEPERVVMGKKMKEWLKFAMAWWHTLGGASADQ 003 AcidFGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGETIAEYERRMKEITDYAQEKMKQFPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGSNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNFELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAADILEHSELPKMKKERYASFDAGIGKDFEDGKLTLEQVYEYGKKVEEPRQTSGKQEKYETIVALYAK 5586MI14_ Prevotella DNA  89ATGGCAAAAGAGTATTTTCCGTTTACTGGTAAGATTCCTTTCGAGGGAAAGGATAGTAAG 
003AATGTAATGGCTTTCCACTATTACGAGCCCGAGAAAGTCGTGATGGGAAAGAAGATGAAGGACTGGCTGAAGTTCGCAATGGCTTGGTGGCATACACTGGGTGGTGCATCTGCAGACCAGTTCGGTGGAGAGACCCGCAGCTACGAGTGGAGCAAGGCTGCTGATCCCGTTCAGCGCGCCAAGGACAAGATGGACGCCGGCTTTGAGATTATGGATAAGCTGGGCATCGAGTACTTCTGTTTCCACGATATAGACCTCGTTCAGGAGGCAGATACCATTGCAGAATATGAGGAGCGCATGAAGGCAATTACCGACTATGCTCTGGAGAAGATGAAGCAGTTCCCCAACATCAAGTTGCTCTGGGGTACCGCTAACGTATTTAGCAACAAGCGCTATATGAACGGTGCTTCTACCAATCCCGACTTCGACGTGGTGGCCCGTGCCATCGTTCAGATCAAGAACGCTATTGATGCAACCATCAAACTCGGTGGTACCAACTATGTATTCTGGGGTGGTCGTGAGGGTTACATGAGCCTATTGAATACCGACCAGAAGCGTGAAAAGGAGCACATGGCAATGATGCTCGGTATGGCTCGCGACTATGCCCGCAGCAAGGGATTCAAGGGTACGTTCCTCATCGAGCCGAAGCCGATGGAGCCCTCTAAGCATCAGTATGATGTCGATACGGAGACTGTGATTGGTTTCCTGAAGGCACACGGTCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACACTGGCTGGTCATACCTTCGAGCATGAGCTGGCTTGCGCTGTTGACGCAGGTATGCTGGGCTCTATCGACGCTAACCGCGGTGATGCCCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTGACACAGGCTATGATGGAAATCATCCGCAACGGTGGTCTGGGCAATGGTGGTACCAACTICGACGCTAAGATCCGCCGTAACTCTACCGACCTCGAGGATCTGTTCATCGCTCATATCAGTGGTATGGATGCTATGGCCCGTGCTTTGTTGAATGCTGCCGACATTCTGGAGAACTCTGAACTGCCCGCTATGAAGAAGGCCCGCTACGCCAGCTTCGACAACGGTATCGGTAAGGACTTCGAGGATGGCAAGCTGACCTTCGAGCAGGTTTACGAATATGGTAAGAAAGTTGAAGAGCCGAAGCAGACCTCTGGCAAGCAGGAGAAATACGAGACCATCGTTGCTCTGTATGCTAAATAA 5586MI14_Prevotella Amino   90MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ 003 AcidFGGETRSYEWSKAADPVQRAKDKMDAGFEIMDKLGIEYFCFHDIDLVQEADTIAEYEERMKAITDYALEKMKQFPNIKLLWGTANVFSNKRYMNGASTNPDFDVVARAIVQIKNAIDATIKLGGTNYVFWGGREGYMSLLNTDQKREKEHMAMMLGMARDYARSKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAADILENSELPAMKKARYASFDNGIGKDFEDGKLTFEQVYEYGKKVEEPKQTSGKQEKYETIVALYAK 5586MI26_ Prevotella DNA  91ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAAATTCCTTTCGAGGGAAAGGACAGTAAG 
003AATGTAATGGCTTTCCACTACTACGAGCCTGAGCGCGTAGTGATGGGAAAGAAGATGAAGGATTGGTTGCGATTTGCAATGGCTTGGTGGCACACACTGGGTGGCGCTTCTGCCGACCAGTTTGGTGGTCAGACCCGCAGTTACGAATGGGACAAGGCTGCTGATGCTGTICAGCGTGCTAAGGACAAGATGGATGCCGGCTTCGAGATTATGGATAAGCTGGGAATCGAGTICTTCTGCTGGCACGATATCGACCTCGTTGAAGAGGGTGAGACCATTGAAGAGTATGAGCGCCGCATGAAGGCTATCACCGACTATGCTCTTGAGAAGATGCAGCAGTATCCCAACATCAAGAACCTCTGGGGAACAGCCAATGTGTTTGGCAACAAGCGTTATGCCAACGGTGCCAGCACAAACCCAGACTTTGACGTCGTTGCTCGTGCTATCGTACAGATTAAGAATGCTATCGACGCTACTATCAAGTTGGGTGGTCAGAATTATGTGTTCTGGGGTGGCCGTGAGGGCTACATGAGCCTGCTCAATACTGACCAGAAGCGTGAGAAGGAGCACATGGCTACAATGCTGACCATGGCACGCGACTATGCCCGCAGCAAGGGATTCAAGGGTAACTTCCTCATTGAGCCCAAGCCCATGGAGCCGTCAAAGCACCAGTATGATGTTGACACCGAGACCGTATGCGGTTTCCTGCGTGCCCACAACCTTGACAAGGATTTCAAGGTAAATATCGAGGTTAACCATGCTACTCTGGCTGGTCATACTTTCGAGCACGAACTGGCATGCGCTGTTGACGCTGGTATGCTTGGTTCTATCGATGCTAACCGTGGTGATGCCCAGAATGGCTGGGATACCGACCAGTTCCCCATCAACAACTATGAACTCACTCAGGCTATGCTTGAGATCATCCGTAATGGTGGTCTGGGTCTTGGCGGCACAAACTTCGATGCCAAGATTCGTCGTAACTCAACAGATCTTGAGGATCTCTTCATCGCTCACATCAGTGGTATGGATGCCATGGCCCGTGCTCTGCTGAATGCTGCTGCTATTCTGGAGGAGAGCGAGCTGCCTAAGATGAAGAAGGAGCGTTATGCTTCTTTCGATGCCGGTATCGGTAAGGACTTCGAGGATGGCAAGCTTACCCTTGAGCAGGCTTACGAGTATGGTAAGAAGGTTGAGGAGCCCAAGCAGACTTCAGGCAAGCAGGAGAAGTACGAGACCATCGTTGCTCTGTATGCAAAATAA 5586MI26_Prevotella Amino   92MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPERVVMGKKMKDWLRFAMAWWHTLGGASADQ 003 AcidFGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEFFCWHDIDLVEEGETIEEYERRMKAITDYALEKMQQYPNIKNLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGNFLIEPKPMEPSKHQYDVDTETVCGFLRAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPINNYELTQAMLEIIRNGGLGLGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAAAILEESELPKMKKERYASFDAGIGKDFEDGKLTLEQAYEYGKKVEEPKQTSGKQEKYETIVALYAK 5586MI86_ Prevotella DNA  93ATGAAACAGTATTTTCCCCAGATTGGAAAGATACCCTTCGAGGGTGTAGAGAGCAAGAAT 
001GTGATGGCTTTCCACTATTATGAGCCAGAAAGAGTAGTCATGGGCAAGCCTATGAAAGAATGGCTGCGCTTCGCTATGGCGTGGTGGCACACGCTGGGGCAGGCGAGCGGCGACCCCTTCGGCGGACAGACCCGCAGCTACGAGTGGGACCGTGCGGCCGACGCGCTACAGCGCGCCAAGGACAAGATGGATGCGGGCTTCGAGCTGATGGAGAAGCTTGGCATTGAGTACTTCTGCTTCCACGACGTGGACCTCGTAGAAGAGGGCGCCACGGTGGAGGAATACGAGCGGCGGATGGCTGCCATCACCGACTACGCGGTAGAGAAGATGCGCGAGCATCCCGAGATACACTGCCTGTGGGGCACGGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGAGCCGCCACCAACCCCGACTTCGACGTGGTGGCGCGTGCGGTGGTGCAGATAAAGAACAGCATCGACGCCACGATCAAGCTGGGCGGCGAGAACTATGTGTTCTGGGGCGGACGCGAGGGATATATGAGCCTGCTCAACACCGACCAGCGCCGCGAGAAGGAGCACCTGGCCATGATGCTTGCGAAGGCCCGCGACTATGGCCGCGCCCACGGCTTCAAGGGCACCTTCCTGATAGAGCCCAAGCCGATGGAGCCCATGAAGCACCAGTACGACGTGGACACCGAGACGGTGATAGGTTTCCTGCGTGCCCACGGACTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGTTGGCGGGCCACACGTTCGAGCACGAGCTGGCCTGTGCCGTCGATGCCGGCATGCTGGGCAGCATCGACGCCAACCGTGGCGACGCGCAGAACGGATGGGATACGGACCAGTTCCCCATAGACTGCTACGAGCTCACGCAGGCGTGGATGGAGATCATTCGTGGCGGCGGCTTCACCACCGGCGGCACCAACTTCGACGCTAAGCTGCGCCGCAACTCGACCGACCCCGAGGATATCTTCATAGCTCACATCAGCGGCATGGATGCTATGGCCCGCGCCCTGCTCTGCGCCGCCGACATCTTGGAGCACAGCGAGCTGCCGGAGATGAAGCGGAAGCGCTATGCCTCGTTCGACAGCGGCATGGGCAAGGAGTTCGAAGAGGGCAATCTCAGCTTCGAGCAAATCTATGCCTACGGCAAGCAGGCGGGCGAACCGGCCACGACCAGCGGCAAGCAGGAGAAATACGAAGCCATTGTTTCACTTTATACCCGATGA 5586MI86_Prevotella Amino   94MKQYFPQIGKIPFEGVESKNVMAFHYYEPERVVMGKPMKEWLRFAMAWWHTLGQASGDPF 001 AcidGGQTRSYEWDRAADALQRAKDKMDAGFELMEKLGIEYFCFHDVDINEEGATVEEYERRMAAITDYAVEKMREHPEIHCLWGTANVFGHKRYMNGAATNPDFDVVARAVVQIKNSIDATIKLGGENYVFWGGREGYMSLLNTDQRREKEHLAMMLAKARDYGRAHGFKGTFLIEPKPMEPMKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDCYELTQAWMEIIRGGGFTTGGTNFDAKLRRNSTDPEDIFIAHISGMDAMARALLCAADILEESELPEMKRKRYASFDSGMGKEFEEGNLSFEQIYAYGKQAGEPATTSGKQEKYEAIVSLYTR 5586MI108_ Prevotella DNA  95ATGGCAAAAGAGTATTTTCCGTTTATCGGTAAGGTTCCTTTCGAAGGAACAGAGAGCAAG 
002AACGTGATGGCATTCCACTACTATGAGCCCGAAAAGGTGGTCATGGGTAAGAAAATGAAGGACTGGCTGAAGTTCGCTATGGCTTGGTGGCACACACTGGGTGGTGCCAGCGCCGACCAGTTTGGTGGTCAGACTCGCAGCTACGAGTGGGACAAGGCTGCTGATGCCGTTCAGCGCGCCAAGGACAAGATGGATGCTGGCTTCGAGATCATGGATAAGCTCGGCATTGAGTACTTCTGCTTCCATGACGTAGACCTCGTTGAGGAGGGTGAAACCGTCGCTGAGTATGAGGCTCGCATGAAGGTCATCACCGACTATGCCCTGGAGAAGATGCAGCAGTTCCCCAACATCAAACTGCTCTGGGGTACTGCTAACGTGTTCGGCCACAAGCGCTATGCCAACGGTGCCAGCACCAATCCCGACTTCGACGTCGTGGCCCGTGCTATCGTTCAGATCAAGAATGCCATCGATGCTACCATTAAGCTCGGCGGTACGAACTATGTGTTCTGGGGTGGTCGTGAGGGCTACATGAGCCTTCTCAACACCGACCAGAAGCGCGAGAAGGAGCACATGGCAACGATGCTGACCATGGCTCGCGACTATGCCCGCGCCAAGGGATTCAAGGGCACGTTCCTCATCGAGCCGAAGCCCATGGAGCCCTCGAAGCATCAGTACGACGTCGACACCGAGACCGTCATCGGCTTCCTCCGTGCCCACGGTCTGGATAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTGGCCGGTCATACCTTCGAGCACGAACTGGCTTGCGCCGTTGATGCCGGCATGCTCGGCTCTATCGATGCCAACCGCGGCGACGCTCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTACGAGCTCACTCAGGCCATGATGGAAATCATCCGTAATGGCGGTCTGGGCAACGGCGGCACGAACTTCGATGCCAAGATCCGTCGTAACAGCACCGACCTCGAGGACCTCTTCATCGCTCACATCAGCGGCATGGATGCCATGGCACGCGCTCTGATGAACGCTGCTGCCATCCTCGAAGAGAGCGAGCTGCCCGCCATGAAGAAGGCCCGCTATGCTTCGTTCGACGAGGGTATCGGCAAGGACTTCGAGGACGGCAAGTTGTCACTTGAGCAGGTCTACGAATATGGTAAGAAGGTTGAGGAGCCCAAGCAGACCTCGGGCAAGCAGGAGAAGTACGAGACCATCGTGGCCCTCTATGCCAAGTAA 5586MI108_Prevotella Amino   96MAKEYFPFIGKVPFEGTESKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ 002 AcidFGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGETVAEYEARMKVITDYALEKMQQFPNIKLLWGTANVFGHKRYANGASTNPDFDVVARAIVQIKNAIDATIKLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPSKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALMNAAAILEESELPAMKKARYASFDEGIGKDFEDGKLSLEQVYEYGKKVEEPKQTSGKQEKYETIVALYAK 5586MI182_ Prevotella DNA  97ATGGCAAAAGAGTATTTTCCGTTTGTTGGTAAGATTCCTTTCGAGGGAAAGGATAGTAAG 
004AATGTAATGGCTTTCCACTATTACGAACCAGAGAAGGTCGTGATGGGAAAGAAGATGAAGGACTGGCTGAAGTTCGCCATGGCATGGTGGCACACACTGGGACAGGCCAGTGCCGACCCGTTTGGAGGTCAGACCCGCAGCTACGAGTGGGACAAGGCTGACGATGCTGTGCAGCGCGCAAAGGACAAGATGGATGCCGGATTTGAGATCATGGACAAGCTGGGCATCGAGTACTTCTGCTTCCACGATGTAGACCTCGTTGAGGAGGGAGCAACTGTTGAGGAGTACGAGGCTCGCATGAAGGCCATCACCGACTATGCATTGGAGAAGATGAAAGAGTATCCCAACATCAAGAACCTCTGGGGTACAGCCAATGTATTCAGCAACAAGCGCTATATGAACGGTGCCAGCACCAACCCCGACTTCGACGTTGTTGCACGTGCCATCGTACAGATAAAGAACGCCATTGACGCTACCATCAAGCTCGGCGGTCAGAACTACGTGTTCTGGGGCGGACGTGAGGGATACATGAGCCTGCTCAACACCGACCAGAAGCGCGAGAAGGAGCACATGGCAACCATGCTGACCATGGCTCGCGACTACGCTCGCAAGAACGGTTTCAAGGGCACATTCCTCATCGAGCCTAAGCCCATGGAACCCTCAAAGCACCAGTACGACGTAGACACAGAGACCGTATGCGGTTTCCTCCGCGCCCATGGTCTTGACAAGGATTTCAAGGTGAACATTGAGGTGAACCACGCTACCCTCGCCGGCCACACCTTTGAGCATGAACTGGCTTGCGCCGTCGACAACGGCATGCTCGGCAGCATCGATGCCAACCGCGGCGACGTTCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTACGAGCTGACTCAGGCCATGCTCGAAATCATCCGCAACGGTGGTCTGGGCAACGGCGGTACCAACTTCGACGCCAAGATCCGTCGTAACTCTACCGACCTCGAGGATCTGTTCATCGCCCACATCAGCGGTATGGACGCCATGGCACGTGCACTGCTCAATGCAGCAGCCATACTGGAGGAGAGCGAGCTGCCTGCCATGAAGAAGGAGCGTTACGCCAGCTTCGACAGCGGCATCGGCAAGGACTTCGAGGACGGCAAGCTCACACTTGAGCAGGCCTATGAGTATGGTAAGAAGGTTGAGGAGCCAAAGCAGACCTCTGGCAAGCAGGAGAAGTATGAGACTATAGTAGCCCTCTACGCTAAGTAG 5586MI182_Prevotella Amino   98MAKEYFPFVGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGQASADP 004 AcidFGGQTRSYEWDKADDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGATVEEYEARMKAITDYALEKMKEYPNIKNLWGTANVFSNKRYMNGASTNPDFDVVARAIVQIKNAIDATIKLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARKNGFKGTFLIEPKPMEPSKHQYDVDTETVCGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDNGMLGSIDANRGDVQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHISGMDAMARALLNAAAILEESELPAMKKERYASFDSGIGKDFEDGKLTLEQAYEYGKKVEEPKQTSGKQEKYETIVALYAK 5586MI193_ Prevotella DNA  99ATGACTAAAGAGTATTTCCCTACCATTGGCAAGATTCCCTTTGAGGGACCTGAAAGCAAG 
004AACCCGCTTGCATTCCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAAGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAGTTCGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCCAAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGCTTCCACGACATCGACCTGGTCGAGGATGCCGATGAAATCGCCGAGTACGAGGCCCGGATGAAGGACATCACCGACTATCTGCTCGTCAAGATGAAAGAGACCGGCATCAAGAACCTTTGGGGAACGGCCAACGTATTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCCGATTTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGTTGGGCGGTCAGAACTATGTGTTCTGGGGCGGCCGTGAAGGCTACCAGACCCTGCTCAATACCCAGATGCAGCGCGAGAAGGAACACATGGGCCGTATGTTGGCACTGGCCCGCGACTATGGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACCAAGCACCAGTACGATCAGGATACGGAAACCGTCATCGGCTTCCTGCGCCGCCATGGCCTCGACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTCGAGCACGAGCTGGCTTGCGCCGTCGACCACGGCATGCTGGGCAGCATCGACGCCAACCGGGGTGATGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACGCTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGATGCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCCATGGATGCCATGGCCCGCGCCCTGGTCAATGCTGTCGCCATTCTCGAGGAATCGCCCATCCCGGCCATGGTCAGGGAACGTTACGCCTCDTTCGACAGCGGAAAGGGCAGGGAATATGAGGAAGGCAGGCTGTCTCTCGAAGACATCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG 5586MI193_Prevotella Amino  100MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ 004 AcidFGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLVEDADEIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIKLGGQNYVFWGGREGYQTLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALVNAVAILEESPIPAMVRERYASFDSGKGREYEEGRLSLEDIVAYAKAHGEPKQISGKQELYETIVALYCK 5586MI195_ Prevotella DNA 101ATGGCAAAAGAGTATTTCCCGCAGATCGGAAAGATCGGCTTTGAGGGTCCTGCAAGCAAG 
003AACCCGCTGGCATTCCATTATTATGACGCCGAGCGCGTGGTGATGGGTAAACCCATGAAAGACTGGTTTAAATTCGCCCTCGCGTGGTGGCACAGCCTCGGCCAGGCCTCCGGCGACCCGTTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAATGCCCCTACTGCCGCGCCCGCGCCAAGGCGGACGCCGGCTTCGAGATCATGCAAAAGCTCGGCATCGGCTATTTCTGCTTCCACGACGTCGACCTCATCGAAGACACGGACGACATCGCCGAATATGAGGCCCGCCTCAAGGACATCACGGACTACCTGCTCGAAAGGATGCAGGAAACCGGCATCAAGAACCTCTGGGGCACGGCCAATGTCTTCGGTCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAGTTCGACATCGTCGCCCGCGCTGCCGTCCAGATCAAGAACGCCCTCGACGCCACCATCAAGCTCGGTGGCTCGAACTACGTCTTCTGGGGCGGCCGCGAAGGTTATTACACGCTGCTCAACACCCAGATGCAGCGCGAGAAAGACCACCTCGCCAAGCTCCTCACCGCCGCCCGCGACTATGCCCGCGCCAAGGGCTTCCAGGGCACCTTCCTGATCGAGCCCAAGCCGATGGAGCCGACCAAGCACCAGTACGATGTCGACACGGAGACTGTAATCGGATTCCTCCGCGCCAACGGACTGGACAAGGACTTCAAGGTCAACATCGAGGTCAACCACGCCACCCTCGCCGGCCATACCTTCGAGCATGAGCTGACCGTCGCCCGCGAGAACGGATTCCTCGGCAGCATCGACGCCAACCGCGGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTGGACGCCTACGACCTCACCCAGGCCATGATGCAGGTGCTCCTGAACGGCGGTTTCGGCAACGGCGGCACCAATTTCGACGCCAAGCTCCGTCGCAGCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCGGCCATTCTCGAGGAGAGCCCGCTGCCCGCGATGGTCAAGGAGCGTTACGCCTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAGGAGGGAAAGGCCACGCTGGAGGACCTCTACGACTACGCCAAGGCCCATGGCGAGCCCGTCGCCGCCTCCGGCAAGCAGGAACTGTGTGAAACTTACCTGAATCTGTATGCAAAGTAA 5586MI195_Prevotella Amino  102MAKEYFPQIGKIGFEGPASKNPLAFHYYDAERVVMGKPMKDWFKFALAWWHSLGQASGDP 003 AcidFGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIGYFCFHDVDLIEDTDDIAEYEARLKDITDYLLERMQETGIKNLWGTANVFGHKRYMNGAGTNPQFDIVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFQGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPLPAMVKERYASFDSGLGKQFEEGKATLEDLYDYAKAHGEPVAASGKQELCETYLNLYAK 5586MI196_ Prevotella DNA 103ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA 
003AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAAGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAGTTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCCAAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGCTTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATGAAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGGGGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGATTTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAATACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTATGGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACCAAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTCGACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTCGAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGCGGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACGCTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGATGCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCCATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATTCCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAAGAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG 5586MI196_Prevotella Amino  104MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ 003 AcidFGGQTRHYAWDDPDCDTARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIKLGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLTEPKPMEPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDTFIAHISAMDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPKQISGKQELYETIVALYCK 5586MI197_ Prevotella DNA 105ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA 
003AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAAGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAGTTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCCAAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGCTTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATGAAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGGGGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGATTTCGACGTGCTGGCCCGTGCCGCCGCCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAATACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTATGGCCGTGCACACGGTTTCAAGGGCACGCTCCTCATCGAGCCCAAACCGATGGAGCCGACCAAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTCGACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTCGAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGCGGTGACGCCCAGGACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACGCTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGATGCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCCATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATTCCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAAGAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG 5586MI197_Prevotella Amino  106MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ 003 AcidFGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAAQIKNAIDATIKLGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTLLIEPKPMEPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANRGDAQDGWDTDQFPIDNYELTLAMLQIIRNGGLADGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPKQISGKQELYETIVALYCK 5586MI199_ Prevotella DNA 107ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA 
003AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAAGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAGTTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCCAAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGCTTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATGAAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGGGGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGATTTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAATACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTATGGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACCAAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTCGACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTCGAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGCGGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACGCTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGATGCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATGTCTTCATCGCGCACATCAGCGCCATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATTCCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAAGAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG 5586MI199_Prevotella Amino  108MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ 003 AcidFGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIKLGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDVFIAHISAMDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPKQISGKQELYETIVALYCK 5586MI200_ Prevotella DNA 109ATGGCAAAAGAGTATTTCCCGACAATCGGAAAGATCCCCTTCGAGGGCGTTGAGAGCAAG 
003AATCCCCTTGCTTTCCATTATTATGACGCCGAGCGCGTGGTCATGGGCAAGCCCATGAAGGACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCGGACCCGTTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCCCGCGCCAAGGCTGACGCCGGCTTCGAGATCATGCAGAAGCTCGGAATCGGCTACTATTGCTTCCACGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAATACGAGGCCCGCATGAAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGGGGCACCGCGAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACATCGTCGCCCGCGCGGCCCTGCAGATCAAGAACGCGATCGATGCCACCATCAAGCTCGGCGGCACCGGCTACGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTGCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCAACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACCAAGCACCAATACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAATGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTTCGAGCACGAGCTCACCGTGGCCGTTGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGATCTCACCCAGGCGATGATCCAGATCATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGACGCCAGGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTCTGCGAGATGGTCGCCAAGCGTTACGCTTCCTTCGACAGCGGCCTCGGCAAAAAGTTCGAGGAAGGCAAGGCCACCCTCGAGGAACTCTACGAGTATGCCAAGGCGAACGGTGAGGTCAAGGCCGAATCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAATAG 5586MI200_Prevotella Amino  110MAKEYFPTIGKIPFEGVESKNPLAFHYYDAERVVMGKPMKDWFKFAMAWWHTLGQASADP 003 AcidFGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARMKDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDIVARAALQIKNAIDATIKLGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARANGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDARLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVIEESPLCEMVAKRYASFDSGLGKKFEEGKATLEELYEYAKANGEVKAESGKQELYETLLNLYAK 5586MI203_ Prevotella DNA 111ATGGCACAAGCGTATTTTCCTACCATCGGGAAAATCCCCTTCGAGGGACCCGAAAGCAAG 
003AATCCCCTGGCATTCCATTATTATGAGCCCGACCGCCTGGTCCTGGGCAAGAAGATGAAGGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCTTCCGGCGACCAGTTCGGCGGCCAGACCCGCCACTACGCCTGGGACGAGCCCGCCACGCCCCTGGAACGGGCCAAGGCCAAGGCGGATGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGAATTCTTCTGCTTCCACGATGTGGACCTCATCGAAGAGGGCGCCACGATCGAGGAATACGAGCAGCGGATGCAGCAGATCACGGATTATCTGCTGGTCAAGATGAAAGAGACCGGCATCCGCAACCTCTGGGGTACGGCCAACGTGTTCGGACACGAGCGCTACATGAACGGCGCGGCCACGAACCCCGATTTCGATGTCGTGGCCCGCGCGGCCGTGCAGATCAAGACGGCCATCGACGCCACCATCAAGTTGGGCGGCGAGAACTATGTGTTCTGGGGCGGCCGGGAAGGCTATATGAGCCTGCTCAATACGCAGATGCACCGCGAGAAGCTGCATCTGGGCAAGATGCTCGCCGCGGCCCGCGACTACGGACGCGCCCACGGCTTCAAGGGGACCTTCCTCATCGAACCCAAGCCGATGGAACCCACCAAGCATCAGTATGACCAGGATACGGAGACGGTCATCGGTTTCCTGCGCCGCTACGGCCTGGACGAAGACTTCAAGGTGAACATCGAGGTCAACCACGCTACGCTGGCCGGCCATACCTTCGAACACGAACTGGCCACGGCGGTCGATGCCGGCCTGCTGGGCAGCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTACGAACTGACCCTGGCGATGCTGCAGGTCATCCGCAACGGCGGTCTGGCCCCGGGCGGCTCGAATTTCGATGCCAAGCTCCGCCGGAACTCCACCGATCCGGAAGACATCTTCATTGCCCACATCAGCGCGATGGATGCGATGGCGCGGGCCCTGCTCAATGCGGCCGCCCTCTGCGAGACGTCCCCGATTCCGGCGATGGTCAAGGCGCGTTACGCTTCGTTCGACAGCGGCGCCGGCAAGGATTTCGAAGAGGGAAGGATGACGCTGGAAGACCTCGTGGCCTATGCCAGGACCCACGGCGAGCCGAAGCGGACCTCGGGCAAGCAGGAACTCTATGAGACCCTCGTGGCGCTTTATTGCAAATAG 5586MI203_Prevotella Amino  112MAQAYFPTIGKIPFEGPESKNPLAFHYYEPDRLVLGKKMKDWLRFAMAWWHTLGQASGDQ 003 AcidFGGQTRHYAWDEPATPLERAKAKADAGFEIMQKLGIEFFCFHDVDLIEEGATIEEYEQRMQQITDYLLVKMKETGIRNLWGTANVFGHERYMNGAATNPDFDVVARAAVQIKTAIDATIKLGGENYVFWGGREGYMSLLNTQMHREKLHLGKMLAAARDYGRAHGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRRYGLDEDFKVNIEVNHATLAGHTFEHELATAVDAGLLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQVIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALLNAAALCETSPIPAMVKARYASFDSGAGKDFEEGRMTLEDLVAYARTHGEPKRTSGKQELYETLVALYCK 5586MI205_ Prevotella DNA 113ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG 
004AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAGGAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCCCGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGCTTCCACGATGTGAATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATGAAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGGGGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAATTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAACACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTATGCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCTGGACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTCGAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACCCAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGACGCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATGCCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAGGAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTGGCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA 5586MI205_Prevotella Amino  114MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP 004 AcidEGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVNIIEDCEDIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPVAASGKQELYETLLNLYAK 5586MI206_ Prevotella DNA 115ATGGCAAAAGAGTATTTCCCGACTATCGGCAAGATTCCCTTCGAGGGCGTCGAATCCAAG 
004AACCCGATGGCATTCCACTATTATGACGCGAAACGCGTCGTGATGGGCAAGCCCATGAAGGACTGGCTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGACAGGCTTCCGGCGACCCGTTCGGCGGCCAGACCCGTTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCCAAGGCCAAGGCCGACGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGAGTACTACTGCTTCCATGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATGAAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGTATCAAGAACCTCTGGGGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACGTCGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAACTCGGCGGCACCTCTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCCACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTCAATATCGAAGTGAACCACGCCACCCTCGCCGGCCACACCTICGAGCATGAGCTCACCGTGGCGGTCGATAACGGCTTCCTCGGCTCCATCGACGCCAACCGTGGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGACCTCACCCAGGCCATGATGCAGATCATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGACGCCAAACTCCGCCGCTCCTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCGCTCCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTCTGCAAGATGGTCGAGGAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAGGAAGGCAAGGCCACCCTTGAGGACCTCTACGAGTATGCCAAGAAGAACGGCGAGCCCGTCGTCGCTTCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAGTAG 5586MI206_Prevotella Amino  116MAKEYFPTIGKIPFEGVESKNPMAFHYYDAKRVVMGKPMKDWLKFAMAWWHTLGQASGDP 004 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYYCFHDIDLVEDTEDIAEYEARMKDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLGGTSYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAHGFKGTFLTEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVIEESPLCKMVEERYASFDSGLGKQFEEGKATLEDLYEYAKKNGEPVVASGKQELYETLLNLYAK 5586MI208_ Prevotella DNA 117ATGTCAACTGAGTATTTCCCTACAATCGGCAAGATTCCCTTCGAGGGACCCGAGAGCAAG 
003AACCCCATGGCCTTCCACTACTATGAACCCGAAAAGTTGGTGATGGGCAAGAAGATGAAGGACTGGCTGCGITTCGCAATGGCCTGGTGGCACACCCTIGGAGCCGCATCCGGCGACCAGTTCGGCGGACAGACCCGCAGTTACGCCTGGGACAAGGGCGACTGCCCTTACAGCCGCGCCCGCGCCAAGGTCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATAGAGTTCTTCTGCTTCCATGACATCGACCTGGTCGAGGATACCGACGACATCGCCGAGTATGAAGCCCGGATGAAAGACATCACGGACTATCTGCTGGAAAAGATGGAGGCTACCGGCATCAAGAACCTCTGGGGCACGGCCAATGTCTTCGGTCACAAGCGTTATATGAACGGTGCAGCCACAAACCCCGATTTCGCAGTGGTCGCAAGGGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGTGGTGAGAACTATGTGTTCTGGGGTGGACGCGAGGGTTATATGAGCCTGCTCAACACCCAGATGCAGAGGGAGAAGGAACACCTTGCCAAGATGCTCACCGCCGCACGTGACTATGCACGCGCCAAAGGTTTCAAGGGCACGTTCCTCATCGAACCCAAGCCGATGGAACCCACCAAGCACCAGTATGACCAGGATACCGAGACCGTTATCGGATTCCTCCGCAGCCACGGCCTGGACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCCACCCTGGCGGGCCATACCTTCGAGCACGAACTGGCCACCGCCGTCGACAACGGCATGCTCGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGACAACTTCGAGCTCACGCTTGCCATGATGCAGATAATCCGCAACGGCGGCCTGGCACCGGGCGGTTCGAACTTCGACGCAAAGCTGCGCCGCAATTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCGCGCCCTCGTCAACGCCGCCGCCATCCTCGGCGAGTCGCCCGTTCCGGCTATGGTCAAGGACCGCTATGCTTCGTTCGACTGCGGCAAGGGCAAGGACTTCGAAGACGGCAAACTGACTCTCGAAGACATCGTCGCCTACGCCAGGGAGAATGGCGAGCCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACTATCGTCGCTCTTTACTGCAAGTAA 5586MI208_Prevotella Amino  118MSTEYFPTIGKIPFEGPESKNPMAFHYYEPEKLVMGKKMKDWLRFAMAWWHTLGAASGDQ 003 AcidFGGQTRSYAWDKGDCPYSRARAKVDAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARMKDITDYLLEKMEATGIKNLWGTANVFGHKRYMNGAATNPDFAVVARAAVQIKNAIDATIKLGGENYVFWGGREGYMSLLNTQMQREKEHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRSHGLDKDFKVNIEVNHATLAGHTFEHELATAVDNGMLGSIDANRGDAQNGWDTDQFPIDNFELTLAMMQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALVNAAAILGESPVPAMVKDRYASFDCGKGKDFEDGKLTLEDIVAYARENGEPKQISGKQELYETIVALYCK 5586MI210_ Prevotella DNA 119ATGTCATATTTTCCTACTATCGGTAACATCCCCTTTGAGGGTGTAGAGAGCAAGAATCCC 
002CTTGCCTTCCATTATTATGACGCTTCCCGCGTAGTTATGGGCAAGCCCATGAAGGAGTGGCTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGTCAGGCATCGGCCGACCCTTTCGGCGGACAAACCCGCAGCTATGCCTGGGACAAAGGCGAGTGCCCCTACTGCCGTGCCCGTGCCAAGGCCGACGCCGGCTTCGAGCTCATGCAGAAACTGGGCATCGAGTATTTCTGCTCCCACGACATTGACCTCATCGAGGACTGCGACGACATTGCAGAGTACGAGGCCCGTCTGAAGGACATTACGGACTACCTCCTGGAGAAGATGAAGAAGACCGGTATCAAGAACCTGTGGGGTACGGCCAATGTGTTCGGTAACAAGCGTTACATGAACGGTGCTGCTACCAACCCTCAGTTTGACGTTGTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATTGACGCTACCATCAAGCTGGGCGGTTCCAACTATGTGTTCTGGGGTGGCCGTGAGGGTTACTACACGCTTCTGAACACCCAGATGCAGCGTGAGAAGAATCACCTGGCTGCCATGCTCAAGGCTGCCCGCGACTATGCCCGCGCCAACGGTTTCAAGGGCACCTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTAGACACGGAGACCGTGATTGGATTCCTCCGCGCCAACGGTCTGGAGAAGGACTTCAAGGTGAACATTGAGGTGAACCACGCTACTCTTGCCGGTCACACCTTCGAGCACGAGCTCACCGTGGCCCGTGAGAACGGCTTCCTGGGTTCCATTGACGCCAACCGCGGAGATGCCCAGAACGGCTGGGACACCGACCAGTTCCCGGTAGATGCCTTTGACCTCACCCAGGCCATGATGCAGATTCTCCTCAACGGAGGCTCCGGCAATGGCGGTACCAACTTTGACGCCAAGCTGCGCCGTTCCTCCACCGACCCCGAGGACATCTTCATCGCGCACATCAGCGCCATGGATGCCATGGCTCACGCCCTGCTCAATGCAGCTGCCGTGCTGGAGGAGAGCCCGCTTTGCAAGATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTTGGCAAGCAGTTCGAGGAAGGAAAGGCTACGCTGGAAGATCTGTATGCCTATGCCGTCAAGAACGGTGAGCCCGTGGTGGCTTCCGGCAAGCAGGAACTGTACGAAACCTTCCTGAACCTCTATGCAAAATGGTAA 5586MI210_Prevotella Amino  120MSYFPTIGNIPFEGVESKNPLAFHYYDASRVVMGKPMKEWLKFAMAWWHTLGQASADPFG 002 AcidGQTRSYAWDKGECPYCRARAKADAGFELMQKLGIEYFCSHDIDLIEDCDDIAEYEARLKDITDYLLEKMKKTGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLGGSNYVFWGGREGYYTLLNTQMQREKNHLAAMLKAARDYARANGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLEKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTQAMMQILLNGGSGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCKMVKERYASFDSGLGKQFEEGKATLEDLYAYAVKNGEPVVASGKQELYETFLNLYAKW 5586MI212_ Prevotella DNA 121ATGTCAACTGAGTATTTCCCTACAATCGGCAAGATTCCCTTCGAGGGACCCGAGAGCAAG 
002AACCCCATGGCCTTCCACTACTATGAACCCGAAAAGTTGGTGATGGGCAAGAAGATGAAGGACTGGCTGCGTTTCGCAATGGCCTGGTGGCACACCCTTGGAGCCGCATCCGGCGACCAGTTCGGCGGACAGACCCGCAGTTACGCCTGGGACAAGGGCGACTGCCCTTACAGCCGCGCCCGCGCCAAGGTCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATAGAGTTCTTCTGCTTCCATGACATCGACCTGGTCGAGGATACCGACGACATCGCCGAGTATGAAGCCCGGATGAAAGACATCACGGACTATCTGCTGGAAAAGATGGAGGTTACCGGCATCAAGAACCTCTGGGGCACGGCCAATGTCTTCGGTCACAAGCGTTATATGAACGATGCAGCCACAAACCCCGATTTCGCAGTGGTCGCAAGGGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGTGGTGAGAACTATGTGTTCTGGGGTGGACGCGAGGGTTATATGAGCCTGCTCAACACCCAGATGCAGAGGGAGAAGGAACACCTTGCCAAGATGCTCACCGCCGCACGTGACTATGCACGCGCCAAAGGTTTCAAGGGCACGTTCCTCATCGAACCCGAGCCGATGGAACCCACCAAGCACCAGTATGACCAGGATACCGAGACCGTTATCGGATTCCTCCGCAGCCACGGCCTGGACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCCACCCTGGCGGGCCATACCTTCGAGCACGAACTGGCCACCGCCGTCGACAACGGCATGCTCGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGACAACTTCGAGCTCACGCTTGCCATGATGCAGATAATCCGCAACGGCGGCCTGGCACCGGGCGGTTCGAACTTCGACGCAAAGCTGCGCCGCAATTCCACCGATCCCGAGGACATCATCATCGCCCACATCAGCGCGATGGACGCCATGGCCCGCGCCCTCGTCAACGCCGCCGCCATCCTCGGCGAGTCGCCCGTTCCGGCTATGGTCAAGGACCGCTATGCTTCGTTCGACTGCGGCAAGGGCAAGGACTTCGAAGACGGCAAACTGACTCTCGAAGACATCGTCGCCTACGCCAGGGAGAATGGCGAGCCGAAACAGATTTCCGGCAAGCAGGAACTCTACGAAACTATCGTCGCTCTTTACTGCAAGTAA 5586MI212_Prevotella Amino  122MSTEYFPTIGKIPFEGPESKNPMAFHYYEPEKLVMGKKMKDWLRFAMAWWHTLGAASGDQ 002 AcidFGGQTRSYAWDKGDCPYSRARAKVDAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARMKDITDYLLEKMEVTGIKNLWGTANVFGHKRYMNDAATNPDFAVVARAAVQIKNAIDATIKLGGENYVFWGGREGYMSLLNTQMQREKEHLAKMLTAARDYARAKGFKGTFLIEPEPMEPTKHQYDQDTETVIGFLRSHGLDKDFKVNIEVNHATLAGHTFEHELATAVDNGMLGSIDANRGDAQNGWDTDQFPIDNFELTLAMMQIIRNGGLAPGGSNFDAKLRRNSTDPEDIIIAHISAMDAMARALVNAAAILGESPVPAMVKDRYASFDCGKGKDFEDGKLTLEDIVAYARENGEPKQISGKQELYETIVALYCK 5586MI213_ Prevotella DNA 123ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG 
003AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAGGAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCCCGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGCTTCCACGATGTGGATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATGAAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGGGGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAATTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAACACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTATGCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCTGGACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTCGAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACCCAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGACGCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATGCCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAGGAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTGGCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA 5586MI213_Prevotella Amino  124MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP 003 AcidFGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGEGNGGTNEDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPVAASGKQELYETLLNLYAK 5586MI215_ Prevotella DNA 125ATGGCAAAAGAGTATTTCCCGCAGATCGGAAAGATCGGCTTTGAGGGTCTTGAGAGCAAG 
003AACCCGATGGCATTCCATTATTATGACGCCGAGCGTGTCGTGCTCGGAAAGAAGATGAAGGACTGGCTGAAGTTCGCGATGGCCTGGTGGCATACGCTCGGACAGGCTTCCGGCGACCCATTCGGCGGCCAGACTCGCAGCTATGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGTGCCCGCGCCAAGGCCGACGCCGGCTTCGAGCTCATGCAGAAGCTCGGCATCGAGTACTTCTGCTTCCACGACATCGACCTCATCGAGGACTGCGACGACATCGACGAGTACGAGGCCCGGATGAAGGACATCACCGACTACCTGCTGGAGAAGATGAAGGAGACCGGAATCAAGAATCTCTGGGGAACGGCCAACGTCTTCGGTCACAAGCGCTACATGAACGGCGCCGCTACCAATCCGCAGTTTGAAATCGTCGCCCGCGCTGCCGTCCAGATCAAGAACGCGCTCGACGCCACCATCAAGCTCGGCGGCTCCAACTACGTCTTCTGGGGCGGCCGCGAGGGCTATTACACGCTGCTGAATACCCAGATGCAGCGCGAGAAGGACCATCTCGCCAGGCTCCTTACCGCCGCCCGCGACTATGCGCGCGCCAAGGGGTTCAAGGGGACCTTCCCCATCGAGCCGAAGCCGATGGAGCCGACCAAGCACCAGTATGACGTCGACACGGAGACCGTCATCGGTTTCCTCCGCCAGAATGGCCTCGACAAGGACTTCAAGGTCAATATCGAGGTGAACCACGCCACCCTCGCCGGCCATACCTTCGAGCACGAGCTGACCGCGGCCCGGGAGAACGGCTTCCTCGGCAGCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGGTGGACGCCTTCGATCTCACGCGGGCCATGATGCAGATCCTGCTCAATGGCGGTTTCGGCAACGGCGGCACCAACTTCGACGCCAAGCTGCGCCGCAGCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAATGCGGCCGCCATCCTCGAGGAAAGCCCGCTGCCGGCCCTGGTCAAGCAGCGCTATGCGTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAGGAGGGTAAGGCCACGCTCGAGGACCTGTACGCATACGCGAAGGAGCACGGCGAGCCCGTCGCGGCCTCCGGCAAGCAGGAGCTCTGCGAGACCTATCTCAACCTCTACGCGAAATAA 5586MI215_Prevotella Amino  126MAKEYFPQIGKIGFEGLESKNPMAFHYYDAERVVLGKKMKDWLKFAMAWWHTLGQASGDP 003 AcidFGGQTRSYEWDKGECPYCRARAKADAGFELMQKLGIEYFCFHDIDLIEDCDDIDEYEARMKDITDYLLEKMKETGIKNLWGTANVEGHKRYMNGAATNPQFEIVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLARLLTAARDYARAKGFKGTFPIEPKPMEPTKHQYDVDTETVIGFLRQNGLDKDFKVNIEVNHATLAGHTFEHELTAARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTRAMMQILLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPLPALVKQRYASFDSGLGKQFEEGKATLEDLYAYAKEHGEPVAASGKQELCETYLNLYAK 5607MI1_ Prevotella DNA 127ATGAGTAAAGAGTATTTTCCTGGGATTGGCAAAATCCCGTATGAGGGAGCCGAGAGCAAG 
003AATGTGATGGCATTCCACTATTATGATCCCGAACGCGTGGTCATGGGCAAGAAAATGAAAGACTGGTTCAAGTTCGCTATTGCCTGGTGGCATACCCTGGGGCAGGCCAGTGCTGACCAGTTTGGCGGACAGACCCGTTTCTATGAATGGGACAAAGCCGAGGACCCCTTGCAGCGTGCCAAGGACAAGATGGATGCCGGTTTTGAAATCATGCAGAAGCTGGGCATCGAGTATTTCTGTTTCCATGATGTGGACCTCATCGAGGAGGCCGATACCATCGAGGAATATGAAGCCCGCATGCAGGCGATTACCGACTACGCGCTGGAGAAGATGAAGGCAACGGGTATCAAGTTGCTGTGGGGCACTGCCAACGTGTTCGGCCACAAGCGTTACATGAACGGCGCCGCCACCAATCCCGACTTCAATGTCGTGGCACGTGCAGCCGTGCAGATCAAGAACGCCCTCGATGCTACCATCAAGTTGGGCGGAACGAGCTACGTCTTCTGGGGCGGTCGTGAAGGCTATCAGAGCCTGCTCAACACCCAGATGCAGCGCGAGAAGAACCACCTGGCCAAGATGCTCACGGCAGCCCGTGACTATGCCCGTGCTAAGGGCTTCAAGGGCACCTTCCTGATTGAGCCCAAGCCGATGGAACCCACCAAGCACCAGTATGACCAGGACACCGAGACCGTTATCGGCTTCTTGCGTGCCAATGGCCTTGACAAGGACTTTAAGGTCAACATTGAGGTCAACCATGCCACGCTGGCTGGCCACACCTTTGCACATGAGTTGGCAGTGGCTGTGGATAACGGTATGCTGGGCAGCATCGATGCTAACCGTGGTGACCACCAGAACGGCTGGGATACAGACCAGTTCCCCATCAACAGTTATGAACTCACCAATGCTATGCTGCAGATCATGCACGGCGGCGGTTTCAAGGACGGCGGTACCAACTTTGACGCCAAGCTGCGCCGCAACAGTACCGACCCCGAGGACATCTTTACCGCTCACATCAGTGGTATGGACGCTCTGGCCCGTGCCCTGTTGAGTGCTGCCGATATCCTTGAGAAGAGCGAGTTGCCTGAAATGCTCAAGGAACGCTATGCCAGCTTTGACGCGGGTGAAGGCAAGCGCTTTGAGGATGGCCAGATGACTCTTGAGGAACTGGTTGCCTATGCCAAGTCCCATGGCGAGCCTGCTACCATCAGTGGCAAGCAGGAAAAATATGAAGCCATCGTGGCTTTGCACGTCAAGTAA 5607MI1_Prevotella Amino  128MSKEYFPGIGKIPYEGAESKNVMAFHYYDPERVVMGKKMKDWFKFAIAWWHTLGQASADQ 003 AcidFGGQTREYEWDKAEDPLQRAKDKMDAGFEIMQKLGIEYFCFHDVDLIEEADTIEEYEARMQAITDYALEKMKATGIKLLWGTANVFGHKRYMNGAATNPDFNVVARAAVQIKNALDATIKLGGTSYVFWGGREGYQSLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFAHELAVAVDNGMLGSIDANRGDHQNGWDTDQFPINSYELTNAMLQIMHGGGFKDGGTNFDAKLRRNSTDPEDIFTAHISGMDALARALLSAADILEKSELPEMLKERYASFDAGEGKRFEDGQMTLEELVAYAKSHGEPATISGKQEKYEAIVALHVK 5607MI2_ Prevotella DNA 129ATGAGTAAAGAGTATTATCCTGAGATTGGCAAAATCCCGTTTGAGGGTCCCGAGAGCAAG 
003AATGTGATGGCGTTCCATTACTATGAACCCGAACGCGTCGTCATGGGTAAGAAGATGAAAGACTGGCTCAAGTTTGCCATGTGCTGGTGGCACAGCCTGGGTCAGGCCAGTGCCGACCAGTTCGGCGGACAGACACGTTTCTACGAGTGGGACAAGGCCGATACCCCCCTGCAGCGTGCCAAGGACAAAATGGATGCCGGATTTGAAATCATGCAGAAGTTGGGCATCGAGTACTTCTGCTTCCACGATGTGGACCTCATCGAGGAGGCCGATACCATCGAGGAATACGAGGCCCGCATGAAGGCCATTACCGACTATGCGCTGGAGAAGATGCAGGCCACCGGCATCAAGTTGCTGTGGGGCACTGCCAATGTGTTCGGCCACAAGCGCTACATGAACGGCGCCGCCACCAATCCCGATTTCAATGTCGTGGCACGTGCCGCCGTCCAAATCAAGAATGCCATCGATGCCACCATCAAGCTGGGCGGCACGAGTTACGTCTTCTGGGGTGGTCGTGAGGGCTATCAGAGTCTGCTCAACACGCAGATGCAGCGCGAGAAGGACCATCTGGCCCGCATGCTGGCGGCAGCCCGCGACTATGGCCGTGCCCATGGCTTCAAGGGCACTTTCCTGATCGAGCCCAAACCCATGGAGCCCACCAAGCACCAGTATGATGTGGACACCGAGACCGTGCTCGGCTTCCTGCGTGCCCACGGCCTGGACAAGGACTTCAAGGTTAACATCGAGGTCAATCATGCTACGCTGGCGGGACACACTTTCAGCCACGAACTGGCTGTGGCCGTGGACAACGGTATGCTGGGCAGCATCGACGCCAACCGCGGCGATTATCAGAATGGCTGGGACACCGACCAGTTCCCCATCGACAGCTTCGAGCTCACCCAGGCCATGCTGCAGATCATGCGCGGCGGCGGCTTCAAGGACGGAGGTACCAACTTCGATGCCAAGCTGCGTCGCAACAGTACCGACCCTGAGGACATCTTCATCGCCCACATCAGCGGTATGGATGCCATGGCACGCGGCCTGTTGAGCGCTGCCGCTATCCTCGAGGATGGCGAGTTGCCCGCGATGCTCAAGGCACGTTATGCCAGCTTTGACCAGGGCGAGGGTAAGCGCTTTGAGGACGGCGAGATGACGCTCGAGCAGCTGGTGGATTATGCAAAGGATTATGCCAAATCGCACGGCGAGCCTGATGTCATCAGCGGCAAGCAGGAGAAGTTTGAAACCATCGTGGCCCTTTACGCCAAGTAA 5607MI2_Prevotella Amino  130MSKEYYPEIGKIPFEGPESKNVMAFHYYEPERVVMGKKMKDWLKFAMCWWHSLGQASADQ 003 AcidFGGQTRFYEWDKADTPLQRAKDKMDAGFEIMQKLGIEYFCFHDVDLIEEADTIEEYEARMKAITDYALEKMQATGIKLLWGTANVFGHKRYMNGAATNPDFNVVARAAVQIKNAIDATIKLGGTSYVFWGGREGYQSLLNTQMQREKDHLARMLAAARDYGRAHGFKGTFLIEPKPMEPTKHQYDVDTETVLGFLRAHGLDKDFKVNIEVNHATLAGHTFSHELAVAVDNGMLGSIDANRGDYQNGWDTDQFPIDSFELTQAMLQIMRGGGFKDGGTNFDAKLRRNSTDPEDIFIAHISGMDAMARGLLSAAAILEDGELPAMLKARYASFDQGEGKRFEDGEMTLEQLVDYAKDYAKSHGEPDVISGKQEKFETIVALYAK 5607MI3_ Prevotella DNA 131ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG
003AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAGGAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCCCGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGCTTCCACGATGTGGATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATGAAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGGGGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAATTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAACACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTATGCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCCGGACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTCGAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACCCAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGACGCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATGCCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAGGAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTGGCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA 5607MI3_Prevotella Amino  132MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP 003 AcidFGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGPDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPVAASGKQELYETLLNLYAK 5607MI4_ Prevotella DNA 133ATGACTAAAGAGTATTTCCCTTCCGTCGGCAAGATTGCCTTTGAAGGACCCGAAAGCAAG
005AACCCTATGGCCTTCCATTATTATGACGCCAATCGCGTGGTAATGGGAAAGCCGATGAAAGAATGGCTTAAATTTGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCCTTCGGCGGTCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCCAAGGCCAAGGCCGATGCCGGCTTTGAACTGATGCAGAAACTGGGCATCGAGTATTTCTGCTTCCACGATATAGACCTGGTGGAAGACTGCGATGATATCGCCGAATACGAGGCCCGCATGAAGGACATCACGGACTATCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGGGGAACCGCCAACGTGTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCTCAGTTCGACATCGTGGCCCGTGCCGCTGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAGCTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGTGAGGGCTACTATACCCTCCTGAACACCCAGATGCAGAGAGAGAAGGACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTATGCCCGTGCCAAGGGCTTCAAGGGCACCTTCCTCATCGAACCCAAGCCGATGGAGCCCACCAAGCACCAGTACGACGTAGATACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAATATTGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTGGATGCCTTCGACCTCACCCAGGCTATGATGCAGATCCTTCTGAACGGAGGCTTCGGCAACGGCGGTACCAACTTCGACGCCAAACTGCGCCGCTCCTCCACGGACCCCGAGGACATCTTCATCGCCCACATCAGCGCTATGGATGCCATGGCCCACGCCCTGCTGAATGCAGCCGCCATCCTGGAGGAAAGCCCGCTTCCGAAGATGCTGAAAGAGCGTTATGCCAGCTTTGACGGCGGTCTGGGCAAGAAGTTCGAAGAAGGCAAGGCCTCTCTGGAAGAACTCTACGAGTATGCCAAGAGCAACGGAGAGCCCGTGGCCGCTTCCGGCAAGCAGGAGCTCTGCGAAACGTACCTGAACCTCTACGCTAAGTAA 5607MI4_Prevotella Amino  134MTKEYFPSVGKIAFEGPESKNPMAFHYYDANRVVMGKPMKEWLKFAMAWWHTLGQASADP 005 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFELMQKLGIEYFCFHDIDLVEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAFDLTQAMMQILLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAILEESPLPKMLKERYASFDGGLGKKFEEGKASLEELYEYAKSNGEPVAASGKQELCETYLNLYAK 5607MI5_ Prevotella DNA 135ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGCCGACAGCAAA 
002AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAGGAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCCTTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCCAAGGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGCATCGAATACTTCTGCTTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATGAAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAGTTCGATGTGGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATTACACCCTCCTCAACACACAGATGCAGCGGGAAAAAGACCACCTGGCCAAGTTGCTGACGGCCGCCCGCGACTATGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAACCCACCAAGCACCAGTACGACGTGGATACGGAGACGGTCATCGGCTTCCTCCGTGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAGCATGAGCTCACCGTGGCCCGCGAGAACGGTTTCCTGGGCTCCATCGATGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACGGACCAGTTCCCTGTGGACCCGTACGATCTTACCCAGGCCATGATGCAGGTGCTGCTGAACGGCGGCTTCGGCAACGGCGGCACCAACTTCGACGCCAAACTCCGCCGCTCCTCCACCGACCCTGAGGACATCTTCATCGCCCATATTTCCGCCATGGATGCCATGGCCCACGCTTTGCTTAACGCAGCTGCCGTGCTGGAAGAGAGCCCCCTGTGCCAGATGGTCAAGGAGCGTTATGCCAGCTTCGACGATGGCCTCGGCAAACAGTTCGAGGAAGGCAAGGCTACCCTGGAAGACCTGTACGAATACGCCAAGGCCCAGGGTGAACCCGTTGTCGCCTCCGGCAAGCAGGAGCTTTACGAGACTCTCCTGAACCTGTATGCCGTCAAGTAA 5607MI5_Prevotella Amino  136MAKEYFPSIGKIPFEGADSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP 002 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDRYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCQMVKERYASFDDGLGKQFEEGKATLEDLYEYAKAQGEPVVASGKQELYETLLNLYAVK 5607MI6_ Prevotella DNA 137ATGACCAAAGAATATTTCCCTACCGTCGGGAAGATCCCCTTCGAGGGCCCCGAAAGCAAG 
002AACCCTATGGCGTTCCATTACTATGACCCCAACCGTCTGGTGATGGGCAAGAAGATGAAAGACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTCGGCCAGGCGTCGGGCGACCAGTTCGGCGGCCAGACCCGCAGTTATGCGTGGGACGAGGGAGAATGCCCGTACGAGCGCGCCCGTGCCAAGGCTGACGCCGGCTTCGAGATCATGCAGAAACTCGGTATCGAGTTCTTCTGCTTCCACGACATCGACCTGATCGAGGATACCGACGACATCGCCGAGTATGAGGCCCGCCTGAAAGACATCACGGACTATCTGCTCGAGAAGATGAAAGCCACTGGCATCAAAAATCTCTGGGGAACGGCCAACGTGTTCGGCCACAAGCGTTGCATGAACGGCGCCGCCACCAACCCGGACTTCGCCGTGCTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGCGGCGAGAACTATGTGTTCTGGGGTGGCCGCGAAGGCTACACGAGCCTGCTCAACACCCAGATGCAGCGTGAGAAAGAGCACCTGGGCCGCCTGCTGTCCCTGGCCCGCGACTATGGCCGCGCCCACGGCTTCAAGGGTACCTTCCTGATCGAGCCCAAGCCGATGGGACCGACGAAACACCAGTACGACCAGGATACGGAAACTGTCATCGGTTTCCTGCGCCGCCACGGTCTAGACAAGGACTTCAAGGTCAATATCGAGGTGAACCATGCCACGCTGGCGGGCCACACCTTCGAACACGAACTGGCCTGCGCCGTGGATCACGGTATGCTGGGCAGCATCGACGCCAACCGCGGTGACGCACAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTTCGAGCTGACCCTTTCCATGCTCCAGATCATCCGCAACGGTGGCCTGGCACCCGGCGGCTCGAATTTCGATGCCAAGCTGCGCCGCAACTCCACCGATCCCGAAGACATTTTCATCGCGCACATCAGCGCCATGGACGCCATGGCCCGCGCATTGGTCAATGCGGCCGCCATCCTGGAGGAGAGCGCTATTCCGAAGATGGTCAAGGAGCGTTACGCTTCGTTCGACAGCGGCAAAGGCAAGGAATACGAGGAAGGCAAGCTGACGCTCGAAGACATCGTGGCCTATGCCAAGGCGAACGGAGAACCGAAGCAGATTTCCGGCAAACAGGAACTCTACGAGACGCTTGTCGCACTCTATAGCAAATAA 5607MI6_Prevotella Amino  138MTKEYFPTVGKIPFEGPESKNPMAFHYYDPNRLVMGKKMKDWLRFAMAWWHTLGQASGDQ 002 AcidFGGQTRSYAWDEGECRYERARAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIAEYEARLKDITDYLLEKMKATGIKNLWGTANVFGHKRCMNGAATNPDFAVLARAAVQIKNAIDATIKLGGENYVFWGGREGYTSLLNTQMQREKEHLGRLLSLARDYGRAHGFKGTFLIEPKPMGPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANRGDAQNGWDTDQFPIDNFELTLSMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALVNAAAILEESAIPKMVKERYASFDSGKGKEYEEGKLTLEDIVAYAKANGEPKQISGKQELYETLVALYSK 5607MI7_ Prevotella DNA 139ATGACCAAAGGGTATTTCCCTACCATCGGCAGGATTCCCTTCGAGGGAACTGAAAGCAAG 
002AATCCCCTCGCATTCCATTACTATGAGCCCGACCGGCTCGTACTGGGCAAGAAAATGAAAGACTGGCTGCGTTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCGTCCGGCGACCAGTTCGGCGGCCAGACCCGCAGCTATGCCTGGGACAAGGCCGAGTGCCCCTATGAGCGCGCCAAGGCCAAAGCCGACGCCGGCTTCGAGATCATGCAGAAACTCGGCATCGAGTTCTTCTGTTTCCACGACATTGACCTCGTTGAGGATACCGACGACATCGCCGAGTATGAGGCCCGGATGAAGGACATTACCGACTATCTCCTGGTCAAGATGAAGGAGACCGGAATCAAGAACCTCTGGGGTACGGCCAATGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAATCCCGACTTCGACGTGGTGGCCCGCGCCGCCGTCCAGATCAAGAACGCCCTCGATGCCACCATCAAGCTGGGCGGTGAAAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATATGAGCCTGCTCAACACGCAGATGCAGCGTGAGAAGGAGCACCTGGGCCGGATGCTGGTCGCCGCCCGCGACTACGCCCGCGCCCACGGCTTCAAGGGTACCTTCCTCATCGAGCCCAAACCGATGGAACCGACCAAGCACCAGTACGACCAGGATACGGAAACCGTGATCGGCTTCCTTCGCCGCCACGGCCTGGACAAGGATTTCAAGGTGAACATCGAAGTGAACCACGCCACGCTGGCCGGCCACACCTTCGAGCACGAACTGGCCACCGCCGTCGACTGCGGCCTGCTGGGCAGCATCGACGCCAATCGCGGCGACGCTCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTTCGAACTCACGCTGGCCATGCTGCAGATTATCCGCAACGGCGGTCTGGCACCCGGCGGCTCGAACTTCGACGCCAAACTGCGCCGTAACTCCACCGATCCGGAAGATATCTTCATCGCCCACATCAGTGCGATGGACGCGATGGCCCGTGCGCTGGTCAACGCCGCCGCAATCTGGGAAGAGTCTCCCATCCCGCAGATGAAGAAAGAACGCTACGCGTCGTTCGACAGCGGCAAGGGCAAGGAATTCGAAGAGGGCAAGCTCTGCCTCGAAGACCTCGTGGCCTATGCCAAGGCGAACGGAGAACCGAAACAGATCTCCGGCAGGCAGGAACTATATGAGACCATCGTCGCCCTTTATTGCAAATAG 5607MI7_Prevotella Amino  140MTKGYFPTIGRIPFEGTESKNPLAFHYYEPDRLVLGKKMKDWLRFAMAWWHTLGQASGDQ 002 AcidFGGQTRSYAWDKAECPYERAKAKADAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNALDATIKLGGENYVFWGGREGYMSLLNTQMQREKEHLGRMLVAARDYARAHGFKGTFLTEPKPMEPTKHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELATAVDCGLLGSIDANRGDAQNGWDTDQFPIDNFELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALVNAAAIWEESPTPQMKKERYASFDSGKGKEFEEGKLCLEDLVAYAKANGEPKQISGRQELYETIVALYCK 5608MI1_ Prevotella DNA 141ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG
004AATCCCATGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAGGAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCCCGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGTATCGGCTATTTCTGCTTCCACGATGTGGATATCATCGAGGACTGCGAAGACATTGCCGAGTATGAGGCCCGTATGAAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAACCTGTGGGGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCTGCCACCAACCCGCAGTTCGACGTGGTGGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAGCTGGGCGGCAGCAATTACGTGTTCTGGGGCGGCCGCGAAGGCTATTATACCCTTTGGAACACGCAGATGCGGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCAGCCCGTGACTATGCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGCTTCCTGCGCGCAAACGGACTGGACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAGCACGAACTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGTTGGGATACAGACCAGTTCCCCATAGATGCCTTTGACCTCACCCAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGACGCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCGGCGCCATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATGCCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAGGAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTGGCCGCTTCCGGCAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA 5608MI1_Prevotella Amino  142MTNEYFPGIGVIPFEGQESKNPMAFHYYDANRVVMGKRMKEWFKFAMAWWHTLGQASADP 004 AcidFGGQTRSYAWDKGECPYCRARCKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARMKDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQEDVVARAAVQIKNALDATIKLGGSNYVFWGGREGYYTLWNTQMRREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPIDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHIGAMDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPVAASGKQELYETLLNLYAK 5608MI2_ Prevotella DNA 143ATGAAAGAATACTTCCCTACCATCGGAAAAATCCCTTTCGAGGGCCCTCAGAGCAAGAAT 
002CCGCTCGCATTCCATTACTATGACGCCAACCGCGTTGTCGCCGGCAAACCCATGAAGGACTGGCTCAAGTTCGCCATGGCTTGGTGGCACACCCTGGGCGCAGCATCGGCAGACCCCTTCGGCGGCCAGACCCGCAGCTACGAGTGGGACAAAGCCGAGTGCCCTTACTGCCGTGCCCGTGAAAAGGCCGACGCCGGCTTCGAGATCATGCAGAAACTTGGAATCGAGTACTTCTGCTTCCATGACATCGACCTTGTGGAAGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATGAAGGACATCACGGACTACCTCCTGGAGAAGATGAAGGCCACCGGCATCAAGAACCTGTGGGGCACCGCCAACGTCTTTGGCAACAAGCGCTACATGAACGGCGCAGCCACCAACCCTCAGTTCGACATCGTTGCCCGTGCAGCTGTCCAGATCAAGAACGCCATCGACGCAACAATCAAGCTGGGCGGTACCGGTTACGTATTCTGGGGCGGCCGCGAGGGCTACTACACCCTCCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTTGCCAAGATGCTCACCGCAGCCCGCGACTACGCCCGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGATGTTGACACGGAAACCGTCATCGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAGCACGAGCTCACCGTGGCCGTGGACAACGGCTTCCTGGGCAGCATCGACGCAAACCGCGGCGACGCCCAGAACGGCTGGGACACTGACCAGTTCCCTGTGGATCCTTACGACCTCACCCAGGCAATGATGCAGATTATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGACGCCAAACTCCGCCGCAGCTCCACGGACCCCGAGGACATCTTCATCGCCCACATCAGCGCAATGGATGCAATGGCACACGCCCTCATCAACGCTGCTGCAGTGCTTGAGGAAAGCCCTCTGTGCGAGATGGTTGCAAAGCGCTACGCCAGCTTTGACAGCGGTCTTGGCAAGAAGTTCGAGGAAGGCAAAGCCACTCTCGAGGAGATCTACGAGTATGCCAAGAAGGCCCCGGCACCCGTCGCCGCCTCCGGCAAGCAGGAGCTCTACGAGACACTGCTCAATCTGTACGCTAAATAA 5608MI2_Prevotella Amino  144MKEYFPTIGKIPFEGPQSKNPLAFHYYDANRVVAGKPMKDWLKFAMAWWHTLGAASADPF 002 AcidGGQTRSYEWDKAECPYCRAREKADAGFEIMQKLGIEYFCFHDIDLVEDCEDIAEYEARMKDITDYLLEKMKATGIKNLWGTANVFGNKRYMNGAATNRQFDIVARAAVQIKNAIDATIKLGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDRYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALINAAAVLEESPLCEMVAKRYASFDSGLGKKFEEGKATLEEIYEYAKKAPAPVAASGKQELYEILLNLYAK 5608MI3_ Prevotella DNA 145ATGACCAAAGAGTATTTCCCTACAATCGGAAAGATTCCCTTCGAAGGCCCGGAGAGCAAG
004AATCCGCTGGCATTCCATTACTATGAACCCGACAGAATCATCCTCGGCAGGAAGATGAAGGACTGGCTGCGCTTCGCCGTGGCCTGGTGGCACACCCTCGGCCAGGCGTCCGGCGACCAGTTCGGAGGCCAGACCCGCAACTATGCGTGGGACGAGCCCGAATGCCCGGTAGAGCGCGCGAAAGCCAAGGCCGACGCCGGCTTCGAGCTGATGCAGAAGCTGGGCATCGAGTATTTCTGCTTCCACGACGTAGACCTCATAGAGGAGGCCGCAACCATCGAAGAATATGAGGAGCGCATGGGCATCATAACCGACTACCTGCTCGGGAAGATGAAGGAGACAGGTATCAAGAACCTCTGGGGCACCGCCAACGTGTTCGGCCACAAGCGTTACATGAACGGAGCCGCCACCAACCCCGACTTCGACGTGGTGGCCCGTGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAGCTGGGCGGCGAGAATTACGTATTCTGGGGCGGACGCGAGGGCTATGCAAGCCTGCTCAACACTCAGATGCAGCGCGAGAAAGACCACCTGGGACGCATGCTGGCTGCAGCCCGCGACTATGGCCGCGCCCACGGATTCAAGGGCACTTTCCTCATCGAGCCCAAACCCATGGAGCCTACCAAGCACCAGTACGACCAGGATACCGAGACCGTTATCGCCTTCCTGCGCAGGAACGGCCTCGACAAGGATTTCAAGGTAAACATCGAGGTGAACCACGCCACCCTGGCGGGCCACACCTTCGAGCACGAACTGGCGGTGGCAGTGGACAACGGCCTGCTTGGCAGCATCGACGCCAACCGCGGCGACGCGCAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTCACCCAGGCCATGCTGCAGATAATCCGCAACGGCGGACTGGGAACCGGCGGATCGAACTTCGACGCCAAGCTGCGCCGCAATTCCACCGACCCTGAGGATATCTTCATCGCCCACATCAGTGCGATGGACGCCATGGCACGCGCGCTGGCAAACGCCGCCGCAATCATCGAAGAGAGCCCCATCCCCGCAATGCTGAAGGAGCGCTACGCATCGTTCGACAGCGGCAAGGGCAAGGAGTTCGAGGACGGCAAACTGAGCCTCGAAGAACTGGTAGCCTACGCCAAGGCGAACGGCGAGCCGAAGCAGATTTCCGGCAAGCAGGAACTCTACGAAACCATAGTGGCCCTCTATTGCAAGTAA 5608MI3_Prevotella Amino  146MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRIILGRKMKDWLRFAVAWWHTLGQASGDQ 004 AcidFGGQTRNYAWDEPECPVERAKAKADAGFELMQKLGIEYFCFHDVDLIEEAATIEEYEERMGIITDYLLGKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIKLGGENYVFWGGREGYASLLNTQMQREKDHLGRMLAAARDYGRAHGFKGTFLTEPKPMEPTKHQYDQDTETVIAFLRRNGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGLLGSIDANRGDAQNGWDTDQFPIDNFELTQAMLQIIRNGGLGTGGSNFDAKLRRNSTDPEDIFIAHISAMDAMARALANAAAIIEESPTPAMLKERYASFDSGKGKEFEDGKLSLEELVAYAKANGEPKQISGKQELYETIVALYCK 5609MI1_ Prevotella DNA 147ATGGCACAAGAATACTTCCCTACCATTGGGAAAATCCCCTTCGAGGGCACTGAGAGCAAG
005AATCCCCTTGCTTTCCATTACTATGAGCCGGAGCGCATTGTCTGCGGCAAACCCATGAAAGAATGGCTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGCCAAACCCGCAGCTATGCCTGGGATAAGGGCGAATGCCCCTACTGCCGTGCCCGCGCCAAGGCGGACGCCGGCTTCGAGATTATGCAAAAGCTGGGCATCGAGTACTTCTGCTTCCACGATATCGACCTGGTAGAAGACTGTGACGATATTGCGGAATACGAAGCCCGCATGAAGGACATCACGGACTACCTCCTGGAGAAGATGAAGGAAACCGGTATCAAGAACCTCTGGGGCACCGCCAATGTGTTTGGTCACAAGCGCTACATGAACGGCGCCGCCACCAACCCGCAGTTTGACGTAGTGGCCCGTGCCGCTGTTCAGATTAAGAACGCCATTGACGCCACCATCAAGTTGGGCGGTGCCAATTACGTGTTCTGGGGCGGCCGCGAGGGCTATTACAGCCTCCTGAACACCCAGATGCAGCGGGAGAAGGACCACCTGGCCAAGCTGCTCACGGCAGCCCGCGACTATGCCCGCGCCAACGGCTTCAAGGGAACCTTCCTGATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTGGATACGGAGACGGTCATTGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAATATCGAGGTGAACCACGCCACGTTGGCCGGCCACACCTTTGAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGCGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACGCTTATGAGCTCACCCAGGCCATGATGCAGGTGCTCCTGAACGGAGGCTTCGGCAACGGCGGCACCAACTTCGACGCCAAGCTGCGCCGCTCCTCCACGGACCCGGAGGACATCTTCATCGCCCATATCAGTGCGATGGATGCCATGGCCCACGCCCTGCTCAACGCCGCCGCCGTGCTGGAGGAAAGCCCCCTGTGCCAGATGGTGAAGGAGCGCTACGCCAGCTTTGACAGCGGTCCGGGCAAGCAGTTCGAGGAAGGAAAGGCCACCCTGGAGGACCTGTACAACTACGCCAAAGCCACCGGTGAACCCGTGGTTGCCTCCGGCAAGCAGGAACTTTACGAGACCCTCCTGAACCTCTATGCAAAGTAG 5609MI1_Prevotella Amino  148MAQEYFPTIGKIPFEGTESKNPLAFHYYEPERIVCGKPMKEWLKFAMAWWHTLGQASADP 005 AcidFGGQTRSYAWDKGECPYCRARAKADAGFEIMQKLGIEYFCFHDIDLVEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLGGANYVFWGGREGYYSLLNTQMQREKDHLAKLLTAARDYARANGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAYELTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCQMVKERYASFDSGPGKQFEEGKATLEDLYNYAKATGEPVVASGKQELYETLLNLYAK 5610MI1_ Prevotella DNA 149ATGGCACAAGAATACTTCCCTACCATTGGGAAAATCCCCTTCGAGGGCACTGAGAGCAAG 
003AATCCCCTTGCTTTCCATTACTATGAGCCGGAGCGCATTGTCTGCGGCAAACCCATGAAAGAATGGCTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGGCAGGCATCGGCCGATCCCTTCGGCGGCCAAACCCGCAGCTATGCCTGGGATAAGGGCGAATGCCCCTACTGCCGTGCCCGTGCCAAGGCGGACGCCGGTTTTGAGATTATGCAAAAGCTGGGCATCGAGTACTTCTGCTTCCACGATATCGACCTGGTAGAAGACTGTGACGATATTGCGGAATACGAAGCCCGCATGAAGGACATCACGGACTACCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGGGGCACCGCCAATGTGTTTGGTCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAGTTTGACGTGGTGGCCCGTGCTGCCGTGCAAATCAAGAACGCCATTGACGCCACCATCAAGTTGGGCGGTGCCAATTACGTGTTCTGGGGCGGCCGCGAGGGCTATTACAGCCTCCTGAACACCCAGATGCAGCGGGAGAAGGACCACCTGGCCAAGCTGCTCACGGCAGCCCGCGACTATGCCCGCGCCAACGGCTTCAAGGGAACCTTCCTGATTGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTGGATACGGAGACGGTCATTGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAATATCGAGGTGAACCACGCCACGCTGGCCGGCCACACCTTTGAGCACGAACTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGCGGCGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACGCTTATGAGCTCACCCAGGCCATGATGCAGGTGCTCCTGAACGGAGGCTTCGGCAACGGCGGCACCAACTTCGACGCCAAGCTGCGCCGCTCCTCCACGGACCTGGAGGACATCTTCATCGCCCATATCAGTGCGATGGATGCCATGGCCCACGCCCTGCTCAACGCCGCCGCCGTGCTGGAGGAAAGCCCCCTGTGCCAGATGGTGAAGGAGCGCTACGCCAGCTTTGACAGCGGTCCGGGCAAGCAGTTCGAGGAAGGAAAGGCCACCCTGGAGGACCTGTACAACTACGCCAAAGCCAACGGTGAACCCGTGGTTGCCTCCGGCAAGCAGGAACTTTACGAGACCCTCCTGAACCTCTATGCAAAGTAG 5610MI1_Prevotella Amino  150MAQEYFPTIGKIPFEGTESKNPLAFHYYEPERIVCGKPMKEWLKFAMAWWHTLGQASADP 003 AcidFGGQTRSYAWDKGECPYCRARAKADAGFEIMQKLGIEYFCFHDIDLVEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQEDVVARAAVQIKNAIDATIKLGGANYVFWGGREGYYSLLNTQMQREKDHLAKLLTAARDYARANGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAYELTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDLEDIFTAHISAMDAMAHALLNAAAVLEESPLCQMVKERYASFDSGPGKQFEEGKATLEDLYNYAKANGEPVVASGKQELYETLLNLYAK 5610MI2_ Prevotella DNA 151ATGGCAAAAGAATATTTCCCTACCATCGGCAAGATTCCTTTTGAAGGAACCGACAGCAAG 
004AGTCCCCTCGCCTTCCATTACTATGACGCCCAGCGCGTTGTGATGGGCAAACCCATGAAGGAATGGCTCAAGTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCATCGGCCGACCCCTTCGGCGGTCAGACCCGCCACTATGCCTGGGATGAAGGCGAATGCCCCTACTGCCGCGCCAAAGCCAAGGCCGACGCCGGCTTCGAGATCATGCAGAAACTGGGCATCGAGTACTTCTGCTTCCACGATGTGGACCTGGTGGAAGACTGCGACGACATCGCCGAGTACGAAGCCCGCATGAAGGACATCACGGACTACCTGCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGGGGCACGGCCAATGTGTTCGGCCACAAGCGTTACATGAACGGCGCCGGGACCAACCCGCAGTTTGACATTGTGGCCCGCGCTGCCGTCCAGATCAAAAACGCCCTGGACGCCACCATCAAGCTGGGCGGTTCCAACTACGTGTTCTGGGGCAGCCGCGAAGGCTACTACACCCTCCTGAACACCCAGATGCAGCGGGAGAAAGACCACCTGGCCAAGCTCCTGACCGCCGCCCGCGACTACGCCCGCGCCAAAGGCTTCAAGGGAACCTTCCTCATCGAGCCCAAACCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACCGAGACCGTAATCGGCTTCCTGCGTGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCTGGCCACACCTTCGAGCACGAACTCACCGTCGCCCGTGAAAACGGCTTCCTCGGATCGATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTAGACGCCTATGACCTCACCCAGGCCATGATGCAGGTGCTGCTGAACGGCGGTTTCGGCAATGGCGGTACCAACTTCGACGCCAAGCTCCGCCGCTCCTCCACGGATCCGGAAGACATCTTCATCGCCCACATCAGCGCCATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTGCTGGAAGAAAGCCCGCTTCCCGCCATGGCGAAAGAGCGCTACGCCTCCTTTGACAGCGGACTTGGCAAGAAGTTCGAAGAGGGAAAGGCCACCCTCGAAGAGCTGTACGACTATGCCAAGGCTAACGACGCCCCTGTCGCCGCCTCCGGCAAGCAGGAACTTTACGAAACCTTCTTGAACCTCTATGCAAAATAG 5610MI2_Prevotella Amino  152MAKEYFPTIGKIPFEGTDSKSPLAFHYYDAQRVVMGKPMKEWLKFAMAWWHTLGQASADP 004 AcidEGGQTRHYAWDEGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVELVEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDIVARAAVQIKNALDATIKLGGSNYVFWGSREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPVDAYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLPAMAKERYASFDSGLGKKFEEGKATLEELYDYAKANDAPVAASGKQELYETFLNLYAK 5751MI1_ Prevotella DNA 153ATGGCAAAACAGTATTTTCCGCAAATCGGAAAGATTAAATTCGAAGGAACAGAGAGCAAG
003AATCCGCTTGCGTTCCATTATTATGACGCAAACAGGGTAGTCCTCGGAAAGGCAATGGAGGAGTGGCTCAAGTTCGCAATGGCTTGGTGGCATACTCTCGGACAGGCTTCCGGAGACCAGTTCGGCGGCCAGACCCGCAGCTACGAGTGGGATCTTGCAGCCACCCCCGAGCAGCGCGCAAAGGACAAGCTCGACGCCGGCTTCGAAATAATGGAGAAACTTGGAATCAAGTATTTCTGTTTCCACGATGTTGACCTTATCGAAGACAGCGACGATATTGCGACATATGAGGCTCGTCTCAAGGACCTTACAGACTACGCTGCAGAGCAGATGAAGCTCCACGACATCAAGCTCCTCTGGGGTACAGCGAATGTATTCGGCAACAAGCGCTACATGAACGGTGCGGCTACAAACCCTGATTTCGATGTAGTTGCCCGCGCAGCCGTTCAGATTAAGAACGCTATCGACGCGACCATCAAGCTCGGTGGTACCAGCTATGTATTCTGGGGCGGTCGTGAGGGATATCAGAGCCTGCTCAACACTCAGATGCAGCGTGAGAAGGACCACCTCGCAACCATGCTTACAATCGCTCGCGACTATGCTCGCAGCAAGGGCTTTACCGGAACCTTCCTTATCGAGCCTAAGCCGATGGAGCCTACAAAACACCAGTACGACGTAGATACAGAGACTGTTGTCGGCTTCCTCAAGGCACACGGCCTGGACAAGGACTTCAAGGTAAATATCGAGGTTAACCACGCAACTCTCGCAGGCCACACCTTCGAGCACGAACTCACCGTTGCTGTGGATAACGGAATGCTCGGTTCTATCGACGCTAACCGCGGTGATGCACAGAACGGCTGGGATACAGACCAGTTCCCTGTAAGCGCTGAGGAGCTTACCCTCGCTATGATGCAGATTATCCGTAATGGTGGCCTTGGCAACGGAGGATCCAACTTCGACGCAAAGCTTCGCCGCAACTCTACCGATCCTGAAGACATCTTCATCGCACACATCTGCGGTATGGATGCAATGGCACACGCTCTCCTCAATGCAGCTGCAATTATCGAGGAGTCTCCTATCCCTACAATGGTTAAGGAGCGTTACGCTTCCTTCGACAGCGGTATGGGTAAGGACTTCGAGGATGGAAAGCTTACCCTCGAGGATCTCTACAGCTACGGCGTGAAGAACGGAGAGCCAAAGCAGACCAGCGCAAAGCAGGAGCTCTATGAGACTCTCATGAATATCTATTGCAAGTAA 5751MI1_Prevotella Amino  154MAKQYFPQIGKIKFEGTESKNPLAFHYYDANRVVLGKAMEEWLKFAMAWWHTLGQASGDQ 003 AcidFGGQTRSYEWDLAATPEQRAKDKLDAGFEIMEKLGIKYFCPHDVDLIEDSDDIATYEARLKDLTDYAAEQMKLEDIKLLWGTANVFGNKRYMNGAATNPDFDVVARAAVQIKNAIDATIKLGGTSYVFWGGREGYQSLLNIQMQREKDHLATMLTIARDYARSKGFTGTFLIEPKPMEPTKHQYDVDTETVVGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGMLGSIDANRGDAQNGWDTDQFPVSAEELTLAMMQIIRNGGLGNGGSNFDAKLRRNSTDPEDIFTAMICGMDAMAHALLNAAAIIEESPIPTMVKERYASFDSGMGKDFEDGKLTLEDLYSYGVKNGEPKQTSAKQELYETLMNIYCK 5751MI2_ Prevotella DNA 155ATGGCAAAAGAATTTTTTCCACAAGTAGGCAAGATTCCATTTGAGGGTCCTGAAAGTACT 
003AACGTACTCGCATTCCACTACTATGATCCAGAACGCGAAGTTCTTGGTAAGAAAATGAAAGATTGGCTGAAGTATGCTATGGCTTGGTGGCACACACTCGGTCAGGCAAGTGGCGACCAATTCGGTCTTCAAACTCGTTCGTATGAATGGGATGAAGCCGACGATGTTCTTCAACGCGCAAAGGATAAAATGGATGCTGGTTTTGAATTGATGACCAAACTTGGCATTGAATACTACTGCTTCCATGATGTCGACCTTATTGAAGAAGGTGCAACAATTGAAGAATATGAAGCTCGTATGCAAGCTATCACCGACTACGCATTAGAAAAACAAAAAGAAACCGGCATTAAGCTCCTTTGGGGTACTGCTAATGTGTTTGGTCATAAGCGTTATATGAATGGTGCGGCAACAAACCCTGACTTTGATGTAGTGGCTCGCGCTGCTGTACAAATCAAGAACGCTATCGATGCAACTATCAAGCTTGGTGGTCAAAACTATGTATTCTGGGGTGGCCGCGAAGGTTATATGAGTTTGCTCAACACTCAAATGCAACGCGAAAAAGACCACTTGGCAAAGATGCTTACCGCAGCTCGCGACTATGCTCGTGCTAAGGGCTTCAAGGGTACATTCCTCGTTGAACCTAAGCCTATGGAACCAACTAAGCATCAATATGATACCGATACAGAAACTGTGATTGGTTTCCTCCGTGCAAATGGTCTTGAAAAAGACTTCAAGGTGAACATTGAAGTGAACCATGCTACTCTCGCTCAGCACACTTTCGAACACGAACTCGCTGTGGCTGTCGACAATGGCATGCTCGGTTCTATCGACGCTAACCGTGGCGATGCTCAAAATGGCTGGGATACCGACCAATTCCCAATCGACAACTACGAACTCACCCTCGCTATGCTCCAAATCATTCGCAATGGTGGTCTTGGCAATGGCGGTAGCAACCTCGACGCTAAGATTCGTCGTAATAGCACCGACCTTGAAGACCTCTTTATCGCTCACATCAGTGGTATGGATGCTATGGCTCGTGCACTTCTCAATGCTGCTGCAATCGTTGAAAAGAGCGAAATTCCTGCTATGTTGAAGCAGCGTTATGCAAGCTCTGATGCAGGTATGGGTAAGGACTTCGAAGAAGGAAAACTCACTCTCGAACAACTCGTAGACTATGCTAAGGCTAACGGCGAACCTGCTACAGTAAGCGGCAAGCAAGAAAAGTATGAAACTCTCGTTGCTCTCTACGCTAAGTAA 5751MI2_Prevotella Amino  156MAKEFFPQVGKIPFEGPESTNVLAFHYYDPEREVLGKKMKDWLKYAMAWWHTLGQASGDQ 003 AcidFGGQTRSYEWDEADDVLQRAKDKMDAGFELMTKLGIEYYCFHDVDLIEEGATIEEYEARMQAITDYALEKQKETGIKLLWGTANVEGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIKLGGQNYVFWGGREGYMSLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLVEPKPMEPTKHQYDTDTETVIGFLRANGLEKDFKVNIEVNHATLAQHTFEHELAVAVDNGMLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLGNGGSNLDAKIRRNSTDLEDLFIAHISGMDAMARALLNAAAIVEKSEIPAMLKQRYASSDAGMGKDFEEGKLTLEQLVDYAKANGEPATVSGKQEKYETLVALYAK 5752MI1_ Prevotella DNA 157ATGACTAAAGAGTATTTCCCGGGAATCGGAAAGATTCCGTTTGAAGGAACCAAGAGCAAG 
003AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGCGATGGGCAAGCCCATGAAGGACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCCTTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCTTATTGCCGCGCCAAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGCTTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATGAAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAGTTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAGCTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAACACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTTGAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGCGGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACCCAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGACGCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTGCCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAAGAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTGGCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG 5752MI1_Prevotella Amino  158MTKEYFPGIGKIPFEGTKSKNPLAFHYYNASQVAMGKPMKDWLKYAMAWWHTLGQASADP 003 AcidFGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCPHDVDIIEDCEDIAEYEARMKDITDYLLEKQKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIKLGGTNYVFWGGREGYYTLLNIQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFIRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPIDALDLTQAMMQVILNGGEGNGGTNEDAKLRRSSTDPEDTFIAEISAMDAMAHALLNAAATLEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYEYAKKNGEPVAASGKQELCETYLNLYAK 5752MI2_ Prevotella DNA 159ATGACTAAAGAGTATTTCCCGGGAATCGGAAAGATTCCGTTTGAAGGAACCAAGAGCAAG
003AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGTGATGGGCAAGCCCATGAAGGACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCCTTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCGTACTGCCGCGCCAAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGCTTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATGAAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAGTTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAACTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAACACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTTGAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGCGGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACCCAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGACGCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTGCCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAAGAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTGGCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG 5752MI2_Prevotella Amino   160MTKEYFPGIGKIPFEGTKSKNPLAFHTYNASQVVNGKPMKDWLKYANAWWHTLGQASADP 003 AcidEGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCPHDVDIIEDCEDIAEYEARMKDITDYLLEKQKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIKLGGTNYVFWGGREGYYTLLNIQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFIRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPIDALDLTQAMMQVILNGGFGNGGTNFDAKLRRSSTDPEDIFTAHISAMDAMAHALLNAAAILEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYETAKKNGEPVAASGKQELCETYLNLYAK 5752MI3_ Prevotella DNA 161ATGGCAAAAGAGTATTTCCCGACTATCGGCAAGATTCCCTTCGAGGGCGTCGAATCCAAG 
002AACCCGATGGCATTCCACTACTATGACGCGAACCGCGTCGTGATGGGCAAGCCCATGAAGGACTGGCTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGACAGGCTTCCGGCGACCCGTTCGGCGGCCAGACCCGTTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCCAAGGCCAAGGCCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGTATCGAGTACTACTGCTTCCATGACATCGACCTCGTGGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATGAAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAAACCGGCATCAAGAACCTCTGGGGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACGTCGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTCGGCGGTACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCCACGGCTTCCAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTGCGCGCCAACGGTCTGGACAAGGACTTCAAGGTCAATATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTTCGAGCACGAGCTCACCGTGGCTGTCGATAACGGCTTCCTCGGCTCCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTGGACCCGTACGACCTCACCCAGGCCATGATGCAGATCATCCGCAACGGCGGTTTCAAGGACGGCGGCACCAACTTCGACGCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTCTGCAAGATGGTCGAGGAGCGCTACGCTTCCTTCGACAGCGGCCTCGGCAAGCAGTTCGAGGAAGGCAAGGCCACCCTCGAGGACCTCTACGAGTATGCCAAGAAGAATGGCGAGCCCGTCGTCGCCTCCGGCAAGCAGGAGCTCTACGAGACGCTGCTGAACCTTTACGCGAAGTAG 5752MI3_Prevotella Amino  162MAKEYFPTIGKIPFEGVESKNPMAFHYYDANRVVMGKPMKDWLKFAMAWWHTLGQASGDP 002 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYYCFHDIDLVEDTEDIAEYEARMKDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLGGTGYVFWGGREGYYTLLNIQMQREKEHLAKMLTAARDYARAHGFQGTFLIEPKPMEPTKHQYEVETETVIGFLRANGLDKDEKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVIEESPLCKMVEERYASFDSGLGKQFEEGKATLEELYETAKKNGEPVVASGKQELYETLLNLYAK 5752MI5_ Prevotella DNA 163ATGGCAAAAGAGTATTTCCCGACAATCGGTAAGATCCCCTTCGAGGGACCCGAGTCCAAG 
003AACCCGATGGCATTCCACTACTATGACGCGGAGCGCGTGGTGATGGGCAAGAAGATGAAGGACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCCGACCCGTTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAAGGCCCCTGCTCCCGCGCCCGCGCCAAGGCTGACGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGGCTACTACTGCTTCCACGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAGTATGAAGCCCGCATGAAGGACATCACCGACTACCTCGTGGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGGGGCACGGCCAACGTATTCGGCAACAAGCCCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACATCGCCGCCCGCGCGGCCCTGCAGACCAAGAACGCCATCGATGCCACCATCAAGCTGGGCGGCACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTTGCCAAGATGCTCACCGCGGCTCGCGACTATGCCCGCGCCCACGGCTTCAAGGGCACCTTCTTCATCGAGCCGAAACCGATGGAGCCCACCAAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAAGTGAACCACGCCACCCTCGCCGGCCACACCTTCGAGCACGGGCTCACCGTGGCCGTTGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGCGGAGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGACCTCACCCAGGCGATGATCCAGATCATCCGCAATGGCGGCTTCAAGGACGGCGGTACCAACTTCGACGCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTGCTCGAGGAGAGCCCGCTCTGCGAGATGGTTGCAAAGCGTTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAGGAAGGCAACGCCACCCTCGAGGAACTCTACGAGTACGCCAAGGCGAAGGGCGAGGTCGTTGCCGAATCCGGCAAGCAGGAACTCTACGAGACCCTGCTGAACCTCTACGCGAAGTAG 5752MI5_Prevotella Amino  164MAKEYFPTIGKIPFEGPESKNPMAFHYYDAERVVMGKKMKDWFKFAMAWWHTLGQASADP 003 AcidFGGQTRSYEWDKGEGPCSRARAKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARMKDITDYLVEKQKETGIKNLWGTANVFGNKPYMNGAATNPQFDIAARAALQTKNAIDATIKLGGTGYVFWGGREGYYTLLNIQMQREKDHLAKMLTAARDYARAHGFKGTFFIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHGLTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCEMVAKRYASFDSGLGKKFEEGNATLEELYEYAKAKGEVVAESGKQELYETLLNLYAK 5752MI6_ Prevotella DNA 165ATGGCAAAAGAGTATTTCCCGACAATCGGAAAGATCCCCTTCGAGGGCGCTGAGAGCAAG 
004AATCCCCTTGCTTTCCACTATTATGACGCCGAGCGTGTGGTCATGGGCAAGCCCATGAAGGACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCCGACCCGTTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCCCGCCAGAAGGCTGACGCCGGTTTCGAGATCATGCAGAAGCTCGGCATCGGCTACTACTGCTTCCACGACATCGACCTGGTCGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATGAAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGGGGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACATCGTCGCCCACGCGGCCCTGCAGATCAAGAACGCGATCGGCGCCACCATCAAGCTCGGCGGCACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGTTACTACACCCTCCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCAACGGCTTCAAGGGCACCTTCCTCATCGAGCCGAAGCCGATGGAGCCCACCAAGCACCAGTATGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTICGAGCACGAGCTCACCGTGGCGGTCGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGCGGTGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGATCTCACCCAGGCGATGATCCAGATCATCCGCAACGGCGGCTTCAAGGATGGCGGCACCAACTTCGACGCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCGATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTCTGCGAGATGGTCGCCAAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAGGAAGGCAACGCCACCCTCGAGGAACTCTACGAGTACGCCAAGGCGAACGGTGAGGTCAAGGCCGAATCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAATAG 5752MI6_Prevotella Amino  166MAKEYFPTIGKIPFEGAESKNPLAFHYYDAERVVMGKPMKDWFKFAMAWWHTLGQASADP 004 AcidFGGQTRSYEWDKGECPYCRARQKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARMKDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDIVAHAALQIKNAIGATIKLGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARANGFKGTFLIEPKPMEPTKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVIEESPLCEMVAKRYASFDSGLGKKFEEGNATLEELYEYAKANGEVKAESGKQELYETLLNLYAK 5753MI1_ Prevotella DNA 167ATGGCAAAAGAGTATTTCCCCACTATCGGGAAGATTCCTTTCGAAGGAGTCGAGAGCAAG 
002AACCCCCTTGCATTCCATTATTATGACGCAAACCGCATGGTCATGGGCAAGCCCATGAAGGACTGGTTCAAGTTCGCCATGGCATGGTGGCACACCCTGGGACAGGCCTCCGCAGACCCGTTCGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCCAGGGCAAAGGCCGATGCCGGCTTCGAGATCATGCAGAAACTGGGTATCGAGTATTTCTGCTTCCATGACATCGACCTGGTAGAGGACTGCGACGACATCGCCGAGTACGAGGCCCGCATGAAGGACATCACGGACTATCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGGGGCACCGCCAACGTGTTCGGCAACAAGCGTTACATGAACGGCGCCGGCACCAATCCGCAGTTCGACGTAGTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAGCTCGGCGGTTCCAACTATGTGTTCTGGGGCGGCCGTGAAGGATACTACACCCTGCTGAACACCCAGATGCAGCGCGAGAAGGACCACCTCGGCAAACTGCTCACCGCCGCCCGCGACTATGCCCGCAAGAACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACCAAGCACCAGTACGACGTAGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTGGAGAAAGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCATACCTTCGAGCATGAACTCACCGTGGCCTTGGACAACGGCTTCCTGGGATCCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACCCGTACGACCTCACCCAGGCCATGATGCAGATCATCCGCAACGGCGGCCTCGGCAACGGCGGTACCAACTTCGACGCCAAACTGCGCCGTTCCTCCACCGATCCTGAGGACATCTTCATCGCCCACATCAGCGCCATGGACGCCATGGCCCACGCCCTGCTCAACGCAGCCGCCGTGCTGGAAGAAAGTCCGCTCTGTGAGATGGTCAAGGAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAAGAGGGCAAGGCTACCCTGGAAGAAATCTACGAGTATGCCAAGAAGAGCGGCGAACCCGTGGTCGCTTCCGGCAAGCAGGAGCTCTACGAAACCCTGCTGAACCTCTACGCCAAGTAG 5753MI1_Prevotella Amino  168MAKEYFPTIGKIPFEGVESKNPLAFHYYDANRMVMGKPMKDWFKFAMAWWHTLGQASADP 002 AcidFGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIEYFCPHDIDLVEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGNKRYMNGAGTNPQFDVVARAAVQIKNATDATIKLGGSNYVFWGGREGYYTLLNTQMQREKDHLGKLLTAARDYARKNGFKGTFLIEPKPMEPTKHQYEVETETVIGFLRANGLEKDEKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGLGNGGTNFDAKLRRSSTDPEDIFTAHISAMDAMAHALLNAAAVLEESPLCEMVKERYASFDSGLGKKFEEGKATLEEIYEYAKKSGEPVVASGKQELYETLLNLYAK 5753MI2_ Prevotella DNA 169ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGGCGACAGCAAA 
002AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAGGAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCCTTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCCAAAGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGTATCGAATACTTCTGCTTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATGAAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAGTTCGACGTGGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCGATTGACGCCACCATCAAGCTCGGCGGTACCAGTTATGTATTCTGGGGCGGCCGCGAGGGCTACTACACCCTCCTGAACACCCAGATGCAGCGTGAGAAAGACCACCTGGCCAAGATGCTCACCGCAGCCCGCGACTACGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACCAAGCACCAGTACGACGTTGACACGGAGACCGTGATCGGCTCCCTGCGCGCCAACGGCCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAGCACGAACTCACCGTGGCTGTTGACAACGGCTTCCTGGGCTCCATCGACGCCAACCGCGGCGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACCCGTACGACCTCACCCAGGCCATGATGCAGATTATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGATGCCAAACTGCGCCGCTCTTCCACCGATCCGGAAGACATCTTCATCGCCCACATCAGCGCTATGGATGCCATGGCACACGCCCTGCTCAACGCCGCCGCCGTGCTGGAAGAGAGCCCGCTGTGCAACATGGTCAAGGAGCGTTACGCCGGCTTCGACAGCGGCCTTGGCAAGAAGTTCGAGGAAGGGAAGGCAACGCTGGAGGAAATCTATGACTATGCCAAGAAGAGCGGCGAACCCGTCGTGGCTTCCGGCAAGCAGGAACTCTACGAAACCATCCTGAACCTCTATGCCAAGTAG 5753MI2_Prevotella Amino  170MAKEYFPSIGKIPFEGGDSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP 002 AcidFGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARMKDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLGGTSYVFWGGREGYYTLLNIQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGSLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRGDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAAVLEESPLCNMVKERYAGFDSGLGKKFEEGKATLEEIYDYAKKSGEPVVASGKQELYETILNLYAK 5753MI4_ Prevotella DNA 171ATGTCAAAAGAGTATTTCCCTACAATCGGCAGGGTCCCCTTCGAGGGACCTGAGAGCAAG 
002AATCCGCTGGCGTTCCACTATTACGAGCCGGACCGGCTCGTCCTGGGCAGGAAAATGAAGGACTGGCTGCGCTTCGCAATGGCCTGGTGGCATACGCTCGGGCAGGCTTCCGGCGACCAGTTCGGCGGACAGACCTGCACATACGCCTGGGATGAAGGCGAGTGTCCCGTCTGCCGGGCAAAGGCCAAGGCTGACGCCGGCTTTGAACTGATGCAGAAACTGGGCATCGGGTATTTCTGCTTCCACGACGTGGACCTGGTCGAGGAGGCCGACACCATTGAAGAATACGAGGAGCGGATGCGGATCATCACCGACTACCTGCTCGAGAAGATGGAAGAGACCGGCATCCGCAATCTCTGGGGAACCGCCAATGTCTTCGGACACAAGCGCTATATGAACGGCGCCGCCACCAATCCCGACTTCGACGTCGTGGCCCGTGCCGCGGTCCAGATCAAGAATGCCATCGATGCCACCATCAAACTGGGTGGTGAGAACTATGTGTTCTGGGGTGGCCGCGAGGGCTATACGAGCCTGCTCAACACGCAGATGCACCGGGAAAAACACCACCTCGGAAATATGCTCAGGGCAGCCCGCGACTATGGCCGTGCCCACGGTTTCAAGGGAACGTTCCTGATCGAGCCCAAGCCGATGGAGCCGACCAAGCATCAGTACGACCAGGATACGGAGACGGTCATCGGTTTCCTGCGCTGTCACGGCCTGGACAAGGATTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTCGCCGGACACACCTTCGAGCACGAACTGGCCACTGCGGTCGATGCCGGCCTGCTGGGCAGCATCGATGCCAACCGCGGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTACGAACTCACGCTGGCGATGCTGCAGATCATCCGCAATGGCGGACTCGCACCCGGCGGATCGAACTTCGATGCCAAGTTGCGCCGCAATTCCACCGATCCGGAAGACATCTTCATCGCCCACATCAGCGCGATGGACGCGATGGCCCGTGCCCTGCTCAATGCGGCGGCCATCTGGACCGAATCGCCGATTCAGGATATGGTCAGGGACCGCTATGCTTCCTTCGACAGCGGAAAGGGCAGGGAGTTCGAGGAAGGCAGACTCAGTCTGGAAGACCTCGTGGCCTATGCGAAGGAGCACGGTGAGCCGCGCCAGATCTCCGGCAGGCAGGAACTTTATGAAACCATCGTAGCGCTTTACTGCAGGTAA 5753MI4_Prevotella Amino  172MSKEYFPTIGRVPFEGPESKNPLAFHYYEPDRLVLGRKMKEWLRFAMAWWHTLGQASGDQ 002 AcidFGGQTCTYAWDEGECPVCRAKAKADAGFELMQKLGIGYFCFHDVDLVEEADTIEEYEERMRIITDYLLEKMEETGIRNLWGTANVEGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIKLGGENYVFWGGREGYTSLLNTQMHREKHHLGNMLRAARDYGRAHGFKGTFLIEPKPMEPTKHQYDQDTETVIGFLRCHGLDKDFKVNIEVNHATLAGHTFEHELATAVDAGLLGSIDANRGDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNEDAKLRRNSTDPEDTFIANISAMDAMARALLNAAAIWTESPIQDMVRDRYASFDSGKGREFEEGRLSLEDLVAYAKEHGEPRQISGRQELYETIVALYCR 5752MI4_ Prevotella DNA 173ATGACTAAAGAGTATTTCCCGGGAATCGGAACGATTCCGTTTGAAGGAACCAAGAGCAAG 
004AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGTGATGGGCAAGCCCATGAAGGACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCCTTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCGTACTGCCGCGCCAAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGCTTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATGAAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGGGGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAGTTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCGCCATCAAACTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAACACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTACGCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACCAAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTGGACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTTGAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGCGGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACCCAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGACGCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCCATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTGCCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAAGAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTGGCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG 5752MI4_Prevotella Amino  174MTKEYFPGIGTIPFEGTKSKNPLAFHYYNASQVVMGKPMKDWLKYAMAWWHTLGQASADP 004 AcidFGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCFHDVDIIEDCEDIAEYEARMKDITDYLLEKQKETGIKNLWGTANVEGHKRYMNGAATNPQFDIVARAAVQIKNALDAAIKLGGTNYVFWGGREGYYTLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTKHQYDVDTETVIGFIRANGLEKDEKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGDAQNGWDTDQFPIDALDLTQAMMQVILNGGEGNGGTNEDAKLRRSSTDPEDIFIAHISAMDAMAHALLNAAATLEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYEYAKKNGEPVAASGKQELCETYLNLYAK 727MI4_ Rhizobiales DNA 175GTGACTGATTTCTTCAAGGGCATCGCGCCCGTCAAGTTTGAGGGGCCGCAGAGCTCCAAT 
006CCGCTGGCCTATCGCCACTATAACAAGGACGAAATCGTCCTCGGCAAGCGGATGGAAGACCATATCCGTCCCGGCGTTGCCTATTGGCACACCTTCGCCTATGAGGGCGGCGATCCGTTTGGCGGCCGCACCTTCGATCGCCCCTGGTTCGACAAGGGTATGGACGGCGCCCGCCTCAAGGCCGACGTGGCCTTCGAACTGTTCGACCTGCTCGACGTTCCTTTCTTCTGTTTCCACGATGCTGATATCGCTCCCGAAGGCGCAACGCTGGCCGAGAGCAACCGCAATGTGCGCGAGATTGGCGAGATCTTCGCTCGCAAGATGGAAACCAGCCGCACCAAGCTGCTCTGGGGTACGGCAAACCTGTTCTCCAATCGCCGCTACATGGCCGGCGCCGCCACCAACCCGGACCCGGAAATCTTCGCCTATGCCGCTGGGCAGGTGAAGAACGTGCTGGAACTGACCCACGAACTGGGCGGCGCCAACTATGTGCTGTGGGGCGGTCGCGAGGGTTATGAAACCCTGCTCAACACCAAGATCGGCCAGGAAATGGACCAGATGGGCCGTTTTCTGTCGATGGTCGTCGAGCATGCCGAAAAGATCGGCTTCAAGGGCCAGATCCTGATCGAGCCCAAGCCGCAGGAGCCGAGCAAGCACCAGTATGACTTCGACGTTGCAACCGTTTACGGCTTCCTCAAGAAGTATGGTCTCGAAACCAAGGTGAAGTGCAATATCGAGGTCGGCCATGCCTTCCTCGCCAATCACTCCTTCGAGCATGAACTGGCTTTGGCCGCATCGCTGGGCATTCTCGGCTCGGTCGACGCCAATCGCAACGATCTACAGTCCGGCTGGGATACCGACCAGTTCCCCAATAATGTCCCCGAAACCGCACTCGCCTTCTATCAGATTCTCAAGGCGGGCGGACTGGGCAATGGCGGCTGGAACTTCGACGCCCGCGTGCGCCGCCAGTCACTTGATCCGGCCGACCTGCTGCACGGCCATATCGGCGGCCTCGACGTGCTGGCGCGCGGCCTCAAGGCCGCCGCGGCGCTGATCGAGGACGGCACCTATGACAAGGTCGTCGACGCCCGCTATGCCGGCTGGAACCAGGGCCTGGGCAAGGATATCCTTGGTGGCAAGCTGAACCTTGCCGACCTGGCTGCCAAGGTCGACGCCGAAAACCTCAACCCGCAGCCTAGGTCCGGCCAGCAGGAATATCTCGAAAACCTGATCAACCGGTTCGTTTAG 727MI4_ RhizobialesAmino  176 MTDFFKGIAPVKFEGPQSSNPLAYRHYNKDEIVLGKRMEDHIRPGVAYWHTFAYEGGDPF006 Acid GGRTFDRPWFDKGMDGARLKADVAFELFDLLDVPFFCFHDADIAPEGATLAESNRNVREIGEIFARKMETSRTKLLWGTANLFSNRRYMAGAATNPDPEIFAYAAGQVKNVLELTHELGGANYVLWGGREGYETLLNTKIGQEMPQMGRFLSMVVEHAEKIGFKGQILIEPKPQEPSKHQYDFDVATVYGFLKKYGLETKVKCNIEVGHAFLANHSFEHELALAASLGILGSVDANRNDLQSGWDTDQFPNNVPETALAFYQILKAGGLGNGGWNFDARVRRQSLDPADLLHGHIGGLDVLARGLKAAAALIEDGTYDKVVDARYAGWNQGLGKDILGGKLNLADLAAKVDAENLNPQPRSGQQEYLENLINRFV

5.5 Example 4: Quantification of XI Enzyme Activity

The clones identified in the ABD and SBD screens (see Table 2) were subcloned into vector p426PGK1 (FIG. 3), a modified version of p426GPD (ATCC accession number 87361) in which the GPD promoter was replaced with the PGK1 promoter from Saccharomyces cerevisiae (ATCC accession number 204501) gDNA. The clones were then transformed into yeast strain MYA11008.

Cells were grown as described in the materials and methods. Cell pellets were resuspended in about 300 μl of lysis buffer (approximate concentrations: 50 mM NaH₂PO₄ (pH 8.0), 300 mM NaCl, and 10 mM imidazole (Sigma, #I5513), to which were added about 2 μl/ml beta-mercaptoethanol (BME) and a protease inhibitor cocktail tablet (Roche, 11836170001) (1 tablet for about 10 ml cell extract)). The cell suspension was added to a 2 ml screw-cap microcentrifuge tube that had been pre-aliquotted with about 0.5 ml of acid-washed glass beads (425-600 μm). Cells were lysed using a FastPrep-24 (MP Biomedicals, Solon, Ohio) at an amplitude setting of about 6 for about 3 repetitions of about 1 minute. Cells were chilled on ice for about 5 minutes between repetitions. Samples were centrifuged at about 10,000×g for about 10 minutes at 4° C. Recovered supernatants were used in the XI enzyme activity assay, which was performed as described in the materials and methods. Results are shown in Table 3.

TABLE 3 XI activity at pH 7.5

SEQ ID NO:     Volumetric Activity   FIOPC
2              −60.73                2.58
4              −21.84                0.93
6              0.86                  −0.05
8              −2.14                 0.12
10             −2.38                 0.13
12             −12.82                0.54
14             −26.97                1.45
16             −76.50                4.12
18             −15.32                0.83
20             −5.33                 0.29
22             0.48                  −0.03
24             0.36                  −0.02
26             0.81                  −0.04
28             −6.65                 0.36
30             −9.10                 0.49
32             −38.10                2.05
34             −21.76                1.17
36             −13.82                0.59
38             −17.58                0.75
40             −12.34                0.52
42             −74.88                3.18
44             −37.10                1.57
46             −35.57                1.51
48             −24.69                1.05
50             −32.23                1.37
52             −26.72                1.13
54             −90.79                3.85
56             −39.89                1.69
58             −74.26                3.15
60             −11.91                0.64
62             −15.43                0.83
64             −12.98                0.70
66             −27.45                1.48
68             −29.43                1.59
70             −4.54                 0.24
72             −8.93                 0.48
74             −0.20                 0.01
76             −0.33                 0.02
78             −50.55                2.15
80             −57.13                2.42
82             −58.09                2.47
84             −46.42                1.97
86             −35.95                1.53
88             −2.16                 0.09
90             −32.77                1.39
92             −30.82                1.31
94             −8.16                 0.35
96             −46.18                1.96
98             −30.05                1.28
100            −8.40                 0.45
102            −8.34                 0.45
104            −3.80                 0.20
106            −4.81                 0.26
108            −12.06                0.65
110            −6.10                 0.33
112            −7.71                 0.42
114            −4.17                 0.22
116            −7.07                 0.38
118            −13.50                0.73
120            −1.15                 0.06
122            0.03                  0.00
124            −4.41                 0.24
126            −0.85                 0.05
128            −14.60                0.79
130            −17.26                0.93
132            −0.75                 0.04
134            −11.55                0.62
136            −7.20                 0.39
138            0.16                  −0.01
140            −3.63                 0.20
142            −3.63                 0.20
144            −1.20                 0.06
146            −16.77                0.90
148            −2.00                 0.11
150            −1.40                 0.08
152            −3.63                 0.20
154            −7.09                 0.38
156            −0.96                 0.05
158            −2.79                 0.15
160            −3.23                 0.17
162            −10.17                0.55
164            −0.51                 0.03
166            −3.43                 0.19
168            −5.65                 0.30
170            −2.35                 0.13
172            −1.20                 0.06
174            −2.29                 0.12
176            −1.92                 0.08
Op-XI (ABD)    −23.56                NA
Op-XI (SBD)    −18.55                NA
Vo-ctrl        −1.74                 NA

5.6 Example 5: Growth of Yeast Containing XI Clones on Xylose

A subset of the XI genes from Example 3 were expressed in Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108) and assayed for the ability to confer growth on xylose. This assay was carried out as follows: colonies were isolated on SC-ura+2% glucose agar plates and inoculated into about 3 ml “pre-cultures” of both SC-ura 2% glycerol and SC-ura 2% xylose media, then incubated overnight at about 30° C. and about 220 rpm. Cells were harvested by centrifugation (about 100×g, 5 minutes), the supernatant was discarded, and the cells were washed twice and resuspended in about 1 ml of SC-ura 2% xylose. Cells were inoculated into BioLector plates containing SC-ura 2% xylose, and inocula were normalized to two different starting optical densities, about OD₆₀₀ 0.2 and 0.4. Plates were covered with gas-permeable seals and incubated in a BioLector microfermentation device (m2p-labs, Model G-BL100) at about 30° C. for about 4 days at 800 rpm and 90% humidity. Growth readings from the BioLector were acquired for 60-100 hours according to the manufacturer's recommendations. Results are shown in FIG. 4.

5.7 Example 6: Ethanol Production Under Anaerobic Conditions

A subset of the XI-expressing yeast clones in strain Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108) were assayed for the ability to ferment xylose to ethanol (EtOH). In brief, single colonies were inoculated into about 25 ml of SC-ura medium supplemented with about 0.1% glucose and about 3% xylose. Cultures were incubated under microaerobic conditions at about 30° C. and about 200 rpm. Samples were harvested at about 0, 24, 48, and 72 h, and ethanol concentration was determined by standard HPLC assays. Ethanol productivity was calculated and is listed in units of grams of EtOH per liter per hour, and FIOPC was generated by comparing the productivity of each clone with that of the control Op-XI. Results are shown in Table 4.

TABLE 4 Anaerobic EtOH Production

                 Time (h)
SEQ ID NO:       0         24        48        72        EtOH (g/L/h)   FIOPC
6                0.28      0         0         0         −0.004         −0.5
8                0         0         0         0         0.000          0.0
10               0         0         0         0         0.000          0.0
14               0.37      0.28      0.71      1.24      0.013          1.7
16               0.33      0.275     0.72      1.06      0.011          1.4
18               0.29      0.135     0.31      0.595     0.005          0.6
20               0.33      0         0         0         −0.004         −0.5
22               0.32      0         0         0         −0.004         −0.5
24               0.28      0         0         0         −0.004         −0.5
26               0.26      0         0         0         −0.003         −0.4
28               0.23      0.385     1.015     1.54      0.019          2.5
30               0.27      0         0         0.07      −0.003         −0.3
32               0         0.165     0.48      0.815     0.012          1.5
34               0         0.125     0.33      0.615     0.009          1.1
36               0         0         0         0         0.000          0.0
46               0         0.285     0.905     1.625     0.023          3.0
60               0.45      0.35      0.87      1.39      0.014          1.8
62               0         0         0         0.065     0.001          0.1
64               0.38      0.275     0.735     1.18      0.012          1.6
66               0         0         0.12      0.22      0.003          0.4
68               0         0.05      0.275     0.5       0.007          0.9
70               0         0         0         0         0.000          0.0
72               0.119     0         0.054     0.1685    0.001          0.1
74               0.21      0.11      0.275     0.57      0.005          0.7
76               0.28      0         0         0         −0.004         −0.5
90               0         0.24      0.69      1.09      0.016          2.0
100              0.104     0.642     0.141     0.366     0.001          0.2
102              0.185     0         0         0.054     −0.002         −0.2
104              0.235     0.536     0         0         −0.005         −0.7
106              0.188     0.4835    0         0         −0.004         −0.6
108              0.19      0.5855    0.1455    0.313     0.000          0.0
110              0.3       0         0         0.05      −0.003         −0.4
112              0.19      0.5535    0.106     0.1135    −0.003         −0.4
114              0.174     0         0         0         −0.002         −0.3
116              0.15      0         0.0515    0.211     0.001          0.1
118              0.177     0.7075    0.5065    0.941     0.009          1.1
120              0.153     0         0         0         −0.002         −0.2
122              0.169     0.553     0         0.074     −0.003         −0.5
124              0.125     0         0         0         −0.002         −0.2
126              0.32      0         0         0         −0.004         −0.5
128              0         0         0         0         0.000          0.0
130              0         0         0         0         0.000          0.0
132              0.121     0         0         0         −0.002         −0.2
134              0.118     0         0         0.1105    0.000          0.0
136              0.108     0         0         0         −0.001         −0.2
138              0.172     0.513     0         0         −0.004         −0.6
140              0.17      0.542     0         0.3135    0.000          −0.1
142              0.102     0         0         0         −0.001         −0.2
144              0.28      0         0         0         −0.004         −0.5
146              0.103     0.635     0.263     0.563     0.004          0.5
150              0.27      0         0         0         −0.003         −0.4
149              0.27      0         0         0         −0.003         −0.4
152              0.17      0         0         0         −0.002         −0.3
154              0.23      0         0         0         −0.003         −0.4
156              0.23      0         0         0         −0.003         −0.4
158              0.4       0         0.105     0.23      −0.002         −0.2
160              0.38      0         0         0         −0.005         −0.6
162              0.36      0.055     0.23      0.41      0.001          0.2
164              0.32      0         0         0         −0.004         −0.5
166              0.31      0         0         0         −0.004         −0.5
168              0.32      0         0.295     0.6       0.005          0.6
170              0.164     0.4995    0         0         −0.004         −0.5
172              0.27      0         0         0         −0.003         −0.4
174              0.3       0         0.17      0.345     0.001          0.2
OP-XI (pos)      0.2385    0.5875    0.6965    0.81508   0.008          NA
Host-(neg)       0.23625   0.088125  0         0         −0.003         NA
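
The text does not state how productivity was derived from the four time points. The Python sketch below assumes, as an illustration only, that productivity is the least-squares slope of ethanol concentration against time and that FIOPC is the ratio of a clone's productivity to that of the Op-XI control. Under that assumption the calculation reproduces the tabulated values for SEQ ID NO:14 and for Op-XI; the function name is hypothetical.

```python
def least_squares_slope(times_h, etoh_g_per_l):
    """Slope of the ordinary least-squares line through (time, EtOH) points, in g/L/h."""
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(etoh_g_per_l) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, etoh_g_per_l))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


TIMES_H = [0, 24, 48, 72]

# Ethanol readings (g/L) taken from Table 4.
seq14 = [0.37, 0.28, 0.71, 1.24]            # SEQ ID NO:14
op_xi = [0.2385, 0.5875, 0.6965, 0.81508]   # Op-XI positive control

prod_seq14 = least_squares_slope(TIMES_H, seq14)
prod_ctrl = least_squares_slope(TIMES_H, op_xi)

# Assumed definition: fold improvement over the positive control.
fiopc_seq14 = prod_seq14 / prod_ctrl

print(f"SEQ ID NO:14 productivity: {prod_seq14:.3f} g/L/h")
print(f"Op-XI productivity:        {prod_ctrl:.3f} g/L/h")
print(f"FIOPC (SEQ ID NO:14):      {fiopc_seq14:.1f}")
```

A regression slope uses all four samples, so a single noisy reading has less influence than it would in a simple end-point difference.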

5.8 Example 7: Impact of pH on XI Activity

Extracts from strain Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108), expressing XI gene candidates in vector p426PGK1, were prepared as described in the Materials and Methods and assayed for XI activity at pH 7.5 and pH 6.0. Percent activity was calculated by dividing the volumetric activity (VA) at pH 6 by the VA at pH 7.5 and multiplying by 100. Results are listed in Table 5.

TABLE 5 XI activity at pH 6 and pH 7.5

SEQ ID NO:   Organism Classification   VA, pH 6 (U/ml)   VA, pH 7.5 (U/ml)   Percent activity (pH 6)
2            Bacteroidales             1.92              2.59                74%
14           Bacteroides               0.32              0.98                32%
16           Bacteroides               1.16              2.40                48%
32           Bacteroides               1.17              2.21                53%
38           Firmicutes                2.46              2.77                89%
42           Firmicutes                1.71              2.18                79%
44           Firmicutes                0.19              0.25                76%
46           Firmicutes                1.49              1.95                76%
50           Firmicutes                0.81              0.95                86%
52           Firmicutes                0.02              0.08                26%
54           Neocallimastigales        1.46              2.90                51%
58           Neocallimastigales        1.89              3.05                62%
68           Neocallimastigales        1.50              1.97                76%
72           Neocallimastigales        0.57              1.04                55%
78           Prevotella                2.40              3.61                67%
80           Prevotella                1.52              2.29                66%
82           Prevotella                1.48              1.65                89%
84           Prevotella                1.79              2.96                61%
96           Prevotella                2.13              3.56                60%
116          Prevotella                0.06              0.13                47%
Host-neg                               0.04              0.02                NA
Op-XI                                  0.61              1.25                49%
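
The percent-activity calculation described for Table 5 can be sketched in a few lines of Python. The function name is hypothetical, and the inputs are the rounded VA values printed in the table, so recomputed percentages may differ from the tabulated ones by about one point for some rows.

```python
def percent_activity(va_ph6, va_ph75):
    """Activity retained at pH 6, as a percentage of the activity at pH 7.5."""
    return 100.0 * va_ph6 / va_ph75


# (VA at pH 6, VA at pH 7.5) in U/ml, copied from Table 5.
examples = {
    "SEQ ID NO:2": (1.92, 2.59),
    "SEQ ID NO:38": (2.46, 2.77),
}

for name, (va6, va75) in examples.items():
    print(f"{name}: {percent_activity(va6, va75):.0f}%")
```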

5.9 Example 8: K_(m) for Selected XI Clones

The K_(m) and V_(max) at pH 6 were determined for a subset of the XI clones, expressed on the p426PGK1 vector in Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108), using the XI activity assay described in the Materials and Methods and varying the concentration of xylose from about 40 to about 600 mM. Results shown were calculated using the Hanes plot, which rearranges the Michaelis-Menten equation (v=V_(max)[S]/(K_(m)+[S])) as [S]/v=K_(m)/V_(max)+[S]/V_(max). Plotting [S]/v against [S] gives a straight line whose y intercept is K_(m)/V_(max), whose slope is 1/V_(max), and whose x intercept is −K_(m). Results are listed in Table 6.

TABLE 6 K_(m) determination for 3 XIs

SEQ ID NO:   K_(m)   V_(max)
78           35.2    27.6
96           33.7    28.0
38           28.8    28.6
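
The Hanes plot fit can be illustrated with a short Python sketch. The rate data below are synthetic: they are generated noise-free from the Michaelis-Menten equation using the K_(m) and V_(max) listed for SEQ ID NO:78, purely to show that the linear fit recovers them. The substrate concentrations and function names are assumptions, not the experimental design.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate v at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_fit(substrate, rate):
    """Fit [S]/v = Km/Vmax + [S]/Vmax by least squares; return (Km, Vmax)."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rate)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                       # slope = 1 / Vmax
    intercept = mean_y - slope * mean_x     # y intercept = Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Hypothetical xylose concentrations (mM) spanning the 40-600 mM range.
xylose_mM = [40, 80, 160, 320, 600]
# Synthetic rates from the Table 6 parameters for SEQ ID NO:78.
rates = [michaelis_menten(s, vmax=27.6, km=35.2) for s in xylose_mM]

km, vmax = hanes_fit(xylose_mM, rates)
print(f"Km = {km:.1f}, Vmax = {vmax:.1f}")
```

With real assay data the points scatter around the line, and the same fit returns the least-squares estimates of K_(m) and V_(max).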

5.10 Example 9: Quantification of XI Activity Expressed from Single Genomic Integration Locus

A vector named pYDAB006 (FIG. 5A) for integration into locus YER131.5 (between YER131W and YER132C) in the S. cerevisiae genome was constructed using conventional cloning methods. The vector backbone, with a PacI site at each end, was derived from pBluescript II SK (+) (Agilent Technologies, Inc., Santa Clara, Calif.) by standard PCR techniques and contained only the pUC origin of replication and the bla gene encoding ampicillin resistance protein as a selectable marker. Two 300-base pair segments named YER131.5-A and YER131.5-B were amplified from yeast genomic DNA by standard PCR techniques and connected with a multiple cloning site (MCS 1: 5′-GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC-3′ (SEQ ID NO:181)) using the overlapping PCR technique. The PCR primers used in the overlapping PCR are shown in Table 7 below:

TABLE 7 Primers Used in pYDAB006 Construction

Primer    SEQ ID NO:   Sequence (PacI site is underlined)
131.5AF   182          caccattaattaaAGCTTTGTAAATATGATGAGAGAATAATATAAATCAAACG
131.5AR   183          GGCGCGCCTCTAGAAAGCTTAATCGACAAGAACACTTCTATTTATATAGGTATGAAA
131.5BF   184          GCAGGGATATCGGTACCCACCAGCGGCCGCTGAAGAAGGTTTATTTCGTTTCGCTGT
131.5BR   185          caccattaattaaCCCAGGTGAGACTGGATGCTCCATA
ABMCSF    186          GCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCCACCAGCGGCCGC
ABMCSR    187          CGCTGGTGGGTACCGATATCCCTGCAGGGAGCTCACGCGTAAGCTTTCTAGAGGCGCGCC

The overlapping PCR product was then ligated with the vector backbone, resulting in plasmid pYDAB006.

A vector named pYDURA01 (FIG. 5B) for generating a selectable and recyclable yeast marker was constructed using a method similar to that used for pYDAB006. The URA3 expression cassette was amplified from yeast genomic DNA by standard PCR techniques. The 200 base pair fingerprint sequence (named R88: TGCGTGTGCCGCGAGTCCACGTCTACTCGCGAACCGAGTGCAGGCGGGTCTTCGGCCAGGACGGCCGTGCGTGACCCCGGCCGCCAGACGAAACGGACCGCGCTCGCCAGACGCTACCCAGCCCGTTCATGCCGGCCGCGAGCCGACCTGTCTCGGTCGCTTCGACGCACGCGCGGTCCTTTCGGGTACTCGCCTAAGAC (SEQ ID NO:188)) at both sides of the URA3 cassette was amplified by standard PCR techniques from the genomic DNA of yBPA317, which was a diploid strain having the genotype MATa/MATalpha; URA3/ura3; YDL074.5::P(TDH3)-CBT1-T(CYC1)-R88 YLR388.5::P(TDH3)-StBGL-T(CYC1)-R88/YLR388.5::P(TDH3)-StBGL-T(CYC1)-R88. The primers used in the amplification are described in Table 8 below:

TABLE 8 Primers Used in pYDURA01 Construction

Primer            SEQ ID NO:   Sequence (KpnI and NotI sites are underlined)
NotI-KpnI-R88-F   189          caatagcggccgcggtaccTGCGTGTGCCGCGAGTCCAC
R88-BamHI-R       190          TGTTAGGATCCGTCTTAGGCGAGTACCCGAAAGG
BamHI-ura-F       191          caataggatccAGGCATATTTATGGTGAAGAATAAGT
ura-Xho-R         192          TGTTACTCGAGAAATCATTACGACCGAGATTCCCG
XhoI-R88-F        193          caatactcgagTGCGTGTGCCGCGAGTCCAC
R88-NotI-R        194          TGTTAGCGGCCGCGTCTTAGGCGAGTACCCGAAAGG

An expression cassette was generated for the XI genes by cloning into a vector named pYDPt005 (FIG. 5C). pYDPt005 was generated using a method similar to that used for pYDAB006. It contained a TDH3 promoter and a PGK1 terminator flanking a multiple cloning site (MCS 2: 5′-ACTAGTGGATCCCTCGAGGTCGACGTTTAAAC-3′ (SEQ ID NO:195), in which ACTAGT is the SpeI site, CTCGAG is the XhoI site, and GTTTAAAC is the PmeI site). The promoter and the terminator were amplified from S. cerevisiae genomic DNA; an AscI site was added to the 5′ end of the TDH3 promoter while a KpnI site was added to the 3′ end of the PGK1 terminator during amplification. Primers used in the amplification are described in Table 9.

TABLE 9 Primers Used in pYDPt005 Construction

Primer          SEQ ID NO:   Sequence (AscI and KpnI sites are underlined)
TDH-F           196          CACCAGGCGCGCCTCTAGAAAGCTTACGCGTAGTTTATCATTATCAATACTGCCATTTCAAAGA
overlap-TDH-R   197          AACGTCGACCTCGAGGGATCCACTAGTTCGAAACTAAGTTCTTGGTGTTTTAAAACT
overlap-PGK-F   198          GTGGATCCCTCGAGGTCGACGTTTAAACATTGAATTGAATTGAAATCGATAGATCAAT
PGK-R           199          CACCAGCGGCCGCGGTACCGATATCCCTGCAGGGAGCTCGAAATATCGAATGGGAAAAAAAAACTGGAT

An Orpinomyces sp. XI gene (NCBI:169733248) was cloned in this vector between the SpeI and XhoI sites. The Orpinomyces sp. XI expression cassette and R88-Ura-R88 fragment were then cloned into vector pYDAB006 using AscI, KpnI and NotI sites; the resulting plasmid was named pYDABF006 (FIG. 5D). Subsequently, the Orpinomyces sp. XI gene in pYDABF0006 was replaced with a subset of the XI genes of Table 2 by digestion of pYDABF0006 with SpeI and PmeI and ligation to a DNA fragment encoding the appropriate XI sequence, which had been amplified from p426PGK1-XI constructs. A SpeI site followed by a Kozak-like sequence (6 consecutive adenines) was added immediately in front of the start codon of the XI genes, while a PmeI site was added to the 3′ end of the XI genes during amplification.
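
The primer tails described above (SpeI site, six adenines, then the start codon; PmeI site at the 3′ end) can be sketched as a small Python helper. This is an illustration under stated assumptions, not the primers actually used: the function names, the annealing length, the optional 5′ clamp, and the toy open reading frame are all hypothetical.

```python
SPEI = "ACTAGT"        # SpeI recognition site
PMEI = "GTTTAAAC"      # PmeI recognition site (identical to its reverse complement)
KOZAK_LIKE = "A" * 6   # six consecutive adenines placed before the start codon

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, anneal_len=20, clamp=""):
    """Build forward and reverse primers (5'->3') carrying the SpeI/Kozak-like and PmeI tails.

    `clamp` is an optional run of extra 5' bases (hypothetical parameter).
    """
    if not orf.upper().startswith("ATG"):
        raise ValueError("ORF is expected to begin with an ATG start codon")
    forward = clamp + SPEI + KOZAK_LIKE + orf[:anneal_len]
    reverse = clamp + PMEI + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Toy ORF for illustration only (not one of the disclosed XI genes).
toy_orf = "ATGGCAAAAGAGTATTTCCCGACTATCGGCAAGATTCCCTTCGAGGGCTACGCGAAGTAG"
fwd, rev = design_primers(toy_orf, anneal_len=18)
print("forward:", fwd)
print("reverse:", rev)
```

In practice the annealing length would be chosen for a suitable melting temperature, and a few extra 5′ bases are commonly added so that the restriction enzymes cut efficiently near the fragment ends.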

XI gene integration cassettes were extracted by PacI digestion and used to transform yeast strain yBPA130 using standard techniques. Transformants were selected for growth on SC-Ura (Synthetic Complete, Ura dropout) agar plates. Integration position and existence of the XI cassette in transformants were confirmed by PCR using the primers shown in Table 10.

TABLE 10 Primers Used in Integration Verification

Primer              SEQ ID NO:   Sequence
5′ of integration   200          ACAGGGATAACAAAGTTTCTCCAGC
3′ of integration   201          CATACCAAGTCATGCGTTACCAGAG
5′ of R88-ura-R88   202          TTTCCCATTCGATATTTCGAGCTCC
3′ of integration   203          CATACCAAGTCATGCGTTACCAGAG

Confirmed clones were then grown for about 18 hours in liquid YPD to allow looping out of the URA3 marker and were selected for growth on SC+5-FOA agar plates. The absence of the URA3 marker was confirmed by PCR.

Strains containing the confirmed XI expression cassettes were inoculated into about 3 ml of modified YP Media (YP+0.1% Glucose+3.0% Xylose) and incubated overnight at about 30° C. and about 220 rpm. These overnight cultures were subcultured into about 25 ml of the same media to about OD₆₀₀=0.2. Samples were incubated overnight at about 30° C. and about 220 rpm. Cultures were harvested when OD₆₀₀ was between about 3 and 4. Pellets were collected by centrifugation for about 5 minutes at about 4000 rpm. The supernatant was discarded, and pellets were washed with about 25 ml of distilled-deionized water and centrifuged again using the same conditions. The supernatant was discarded and the pellet frozen at about −20° C. until lysis and characterization.

Cell pellets were thawed and about 200 mg of each pellet sample was weighed out into 2 ml microcentrifuge tubes. About 50 μl of Complete®, EDTA-free Protease Inhibitor cocktail (Roche Part#11873 580 001) at 5 times the concentration stated in the manufacturer's protocol was added to each sample, followed by about 0.5 ml of Y-PER Plus® Dialyzable Yeast Protein Extraction Reagent (Thermo Scientific Part#78999) (YP+). Samples were incubated at about 25° C. for about 4 hours on a rotating mixer. Sample supernatants were collected after centrifugation at about 10,000×g for about 10 minutes for characterization.

Total protein concentrations of the XI sample extracts prepared above were determined using Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad, cat#500-0006, Hercules, Calif.), which is a modified version of the Bradford method (Bradford).

Yeast physiological pH is known to range from about pH 6 to about pH 7.5 (Pena, Ramirez et al., 1995, J. Bacteriology 4:1017-1022). Ranking of XI activity at yeast physiological pH was accomplished using the assay conditions at pH 7.5 and modified for pH 6.0 as described in the materials and methods. The specific activities (SA) of 20 XIs expressed from a single copy integrated into the yeast YER131.5 locus were evaluated. The results are listed in Table 11.

TABLE 11 SA of XI Expressed in an Industrial S. cerevisiae

SEQ ID NO:      Organism Classification   SA, pH 6 (U/mg)   SA, pH 7.5 (U/mg)
2               Bacteroidales             0.86              1.08
14              Bacteroides               0.33              1.07
16              Bacteroides               0.57              1.05
32              Bacteroides               0.53              1.00
38              Firmicutes                1.00              0.94
42              Firmicutes                0.79              0.82
44              Firmicutes                0.08              0.10
46              Firmicutes                0.62              0.69
50              Firmicutes                0.35              0.41
52              Firmicutes                0.01              0.03
54              Neocallimastigales        0.64              1.17
58              Neocallimastigales        0.79              1.10
68              Neocallimastigales        0.01              0.02
72              Neocallimastigales        0.22              0.40
78              Prevotella                1.10              1.45
80              Prevotella                0.74              1.11
82              Prevotella                0.54              0.60
84              Prevotella                0.76              1.06
96              Prevotella                1.10              1.62
116             Prevotella                0.03              0.06
Host neg ctrl                             0.00              0.02

5.11 Example 10: Identification of Sequence Motifs in Acid Tolerant XIs

The proposed mechanism of xylose isomerases can be summarized as follows: (i) binding of xylose to xylose isomerase, so that O3 and O4 are coordinated by metal ion I; (ii) enzyme-catalyzed ring opening (the identity of the ring-opening group remains a subject for further investigation; ring opening may be the rate limiting step in the overall isomerization process); (iii) chain extension (sugar binds in a linear extended form) in which O2 and O4 now coordinate metal ion I; (iv) O2 becomes deprotonated causing a shift of metal ion II from position 1 to an adjacent position 2 in which it coordinates O1 and O2 of the sugar together with metal ion I; (v) isomerization via an anionic transition state arises by a hydride shift promoted by electrophilic catalysis provided by both metal ions; (vi) collapse of transition state by return of metal ion II to position 1; (vii) chain contraction to a pseudo-cyclic position with ligands to metal ion I changing from O2/O4 back to O3/O4; (viii) enzyme-catalyzed ring closure; (ix) dissociation of xylulose from xylose isomerase (Lavie et al., 1994, Biochemistry 33(18), 5469-5480).

Many XIs identified contained one or both of two signature sequences characteristic of XIs, [LI]EPKP.{2}P (SEQ ID NO:204) and [FL]HD[^K]D[LIV].[PD].[GDE] (SEQ ID NO:205). Additional sequence motifs present in the top performing Firmicutes and Prevotella XIs were identified. The motifs are located near the active site, including residues in direct contact with the D-xylose and/or the metal ions. The motifs are shown in Table 12 below:

TABLE 12
XI Sequence Motifs

XI Source        Motif Sequence                                     SEQ ID NO:
Firmicutes 1A    P[FY][AST][MLVI][AS][WYFL]W[HT]N[LFMG]GA           206
Firmicutes 1B    P[FY][AS].{2}[WYFL]W[HT][^TV].GA                   207
Firmicutes 2     [GSN][IVA]R[YFHG][FYLIV]C[FW]HD.D                  208
Firmicutes 3     T[ASTC][NK][^L]F.[NDH][PRKAG][RVA][FY]C            209
Firmicutes 4     [WFY]D[TQVI]D.[FY][PF][^T].{2,4}[YFH]S[ATL]T       210
Firmicutes 5     GF[NH]FD[SA]KTR                                    211
Prevotella 1A    FG.QT[RK].{2}E[WYF][DNG].{2,3}[DNEGT][AT]          212
Prevotella 1B    FG.QT[RK].{2}E[WYF][DNG].{3}[^C][^P]               213
Prevotella 2     [FW]HD.D[LVI].[DE]EG[^P][TSD][IV][EA]E             214
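The signature sequences and motifs above are written in a notation that standard regular-expression engines accept directly (with `^` marking an excluded residue inside brackets). The following is a minimal sketch of how a protein sequence might be scanned for a few of them using Python's `re` module. The `toy` peptide is a made-up string assembled for illustration and is not one of the sequences of this disclosure; the function name `find_motifs` is likewise hypothetical.

```python
import re

# A subset of the motifs, keyed by a descriptive label.
MOTIFS = {
    "XI signature 1 (SEQ ID NO:204)": r"[LI]EPKP.{2}P",
    "XI signature 2 (SEQ ID NO:205)": r"[FL]HD[^K]D[LIV].[PD].[GDE]",
    "Firmicutes 5 (SEQ ID NO:211)": r"GF[NH]FD[SA]KTR",
    "Prevotella 2 (SEQ ID NO:214)": r"[FW]HD.D[LVI].[DE]EG[^P][TSD][IV][EA]E",
}


def find_motifs(protein):
    """Return {motif label: [1-based start positions]} for motifs that occur.

    re.finditer reports non-overlapping matches only, which is sufficient
    for locating a motif but may undercount overlapping occurrences.
    """
    hits = {}
    for label, pattern in MOTIFS.items():
        positions = [m.start() + 1 for m in re.finditer(pattern, protein)]
        if positions:
            hits[label] = positions
    return hits


# Hypothetical test peptide: filler residues around two planted motifs.
toy = "MAK" + "FHDVDLVEEGDTIEE" + "AAA" + "LEPKPMEP" + "GG"
print(find_motifs(toy))
```

In the toy peptide the planted `FHDVDLVEEGDTIEE` stretch satisfies the Prevotella 2 motif but not XI signature 2, because signature 2 requires P or D at its eighth position.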

5.12 Example 11: In Vivo Evaluation of Xylose Isomerase

Haploid S. cerevisiae strains yBPA130 (MATa::ura3) and yBPA136 (MATalpha::ura3) were genetically modified to enhance C5 xylose utilization during fermentation. The modifications were as follows: the native glucose-repressible alcohol dehydrogenase II gene ADH2 was disrupted by inserting an expression cassette of the endogenous transaldolase gene TAL1 (SEQ ID NO:215) and xylulokinase gene XKS1 (SEQ ID NO:216). The PHO13 gene, encoding the native alkaline phosphatase specific for p-nitrophenyl phosphate, was disrupted by inserting the native transketolase-1 gene TKL1 (SEQ ID NO:217). The native aldose reductase gene GRE3 was disrupted by inserting the native D-ribulose-5-phosphate 3-epimerase gene RPE1 (SEQ ID NO:218) and ribose-5-phosphate ketol-isomerase gene RKI1 (SEQ ID NO:219). In addition, one expression cassette of the native galactose permease gene GAL2 (SEQ ID NO:220) was integrated into the S. cerevisiae strain, resulting in haploid strains pBPB007 (MATa::ura3) and pBPB008 (MATalpha::ura3). The genotype of pBPB007 and pBPB008 is adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2. The sequences are shown in Table 13, below:

TABLE 13 SEQ Sequence Type of ID Name sequence NO: Sequence TAL1 (S. DNA215 ATGTCTGAACCAGCTCAAAAGAAACAAAAGGTTGCTAACAACTCT cerevisiae)CTAGAACAATTGAAAGCCTCCGGCACTGTCGTTGTTGCCGACACTGGTGATTTCGGCTCTATTGCCAAGTTTCAACCTCAAGACTCCACAACTAACCCATCATTGATCTTGGCTGCTGCCAAGCAACCAACTTACGCCAAGTTGATCGATGTTGCCGTGGAATACGGTAAGAAGCATGGTAAGACCACCGAAGAACAAGTCGAAAATGCTGTGGACAGATTGTTAGTCGAATTCGGTAAGGAGATCTTAAAGATTGTTCCAGGCAGAGTCTCCACCGAAGTTGATGCTAGATTGTCTTTTGACACTCAAGCTACCATTGAAAAGGCTAGACATATCATTAAATTGTTTGAACAAGAAGGTGTCTCCAAGGAAAGAGTCCTTATTAAAATTGCTTCCACTTGGGAAGGTATTCAAGCTGCCAAAGAATTGGAAGAAAAGGACGGTATCCACTGTAATTTGACTCTATTATTCTCCTTCGTTCAAGCAGTTGCCTGTGCCGAGGCCCAAGTTACTTTGATTTCCCCATTTGTTGGTAGAATTCTAGACTGGTACAAATCCAGCACTGGTAAAGATTACAAGGGTGAAGCCGACCCAGGTGTTATTTCCGTCAAGAAAATCTACAACTACTACAAGAAGTACGGTTACAAGACTATTGTTATGGGTGCTTCTTTCAGAAGCACTGACGAAATCAAAAACTTGGCTGGTGTTGACTATCTAACAATTTCTCCAGCTTTATTGGACAAGTTGATGAACAGTACTGAACCTTTCCCAAGAGTTTTGGACCCTGTCTCCGCTAAGAAGGAAGCCGGCGACAAGATTTCTTACATCAGCGACGAATCTAAATTCAGATTCGACTTGAATGAAGACGCTATGGCCACTGAAAAATTGTCCGAAGGTATCAGAAAATTCTCTGCCGATATTGTTACTCTATTCGACTTGATTGAA AAGAAAGTTACCGCTTAAXKS1 (S. 
DNA 216 ATGTTGTGTTCAGTAATTCAGAGACAGACAAGAGAGGTTTCCAACcerevisiae) ACAATGTCTTTAGACTCATACTATCTTGGGTTTGATCTTTCGACCCAACAACTGAAATGTCTCGCCATTAACCAGGACCTAAAAATTGTCCATTCAGAAACAGTGGAATTTGAAAAGGATCTTCCGCATTATCACACAAAGAAGGGTGTCTATATACACGGCGACACTATCGAATGTCCCGTAGCCATGTGGTTAGAGGCTCTAGATCTGGTTCTCTCGAAATATCGCGAGGCTAAATTTCCATTGAACAAAGTTATGGCCGTCTCAGGGTCCTGCCAGCAGCACGGGTCTGTCTACTGGTCCTCCCAAGCCGAATCTCTGTTAGAGCAATTGAATAAGAAACCGGAAAAAGATTTATTGCACTACGTGAGCTCTGTAGCATTTGCAAGGCAAACCGCCCCCAATTGGCAAGACCACAGTACTGCAAAGCAATGTCAAGAGTTTGAAGAGTGCATAGGTGGGCCTGAAAAAATGGCTCAATTAACAGGGTCCAGAGCCCATTTTAGATTTACTGGTCCTCAAATTCTGAAAATTGCACAATTAGAACCAGAAGCTTACGAAAAAACAAAGACCATTTCTTTAGTGTCTAATTTTTTGACTTCTATCTTAGTGGGCCATCTTGTTGAATTAGAGGAGGCAGATGCCTGTGGTATGAACCTTTATGATATACGTGAAAGAAAATTCAGTGATGAGCTACTACATCTAATTGATAGTTCTTCTAAGGATAAAACTATCAGACAAAAATTAATGAGAGCACCCATGAAAAATTTGATAGCGGGTACCATCTGTAAATATTTTATTGAGAAGTACGGTTTCAATACAAACTGCAAGGTCTCTCCCATGACTGGGGATAATTTAGCCACTATATGTTCTTTACCCCTGCGGAAGAATGACGTTCTCGTTTCCCTAGGAACAAGTACTACAGTTCTTCTGGTCACCGATAAGTATCACCCCTCTCCGAACTATCATCTTTTCATTCATCCAACTCTGCCAAACCATTATATGGGTATGATTTGTTATTGTAATGGTTCTTTGGCAAGGGAGAGGATAAGAGACGAGTTAAACAAAGAACGGGAAAATAATTATGAGAAGACTAACGATTGGACTCTTTTTAATCAAGCTGTGCTAGATGACTCAGAAAGTAGTGAAAATGAATTAGGTGTATATTTTCCTCTGGGGGAGATCGTTCCTAGCGTAAAAGCCATAAACAAAAGGGTTATCTTCAATCCAAAAACGGGTATGATTGAAAGAGAGGTGGCCAAGTTCAAAGACAAGAGGCACGATGCCAAAAATATTGTAGAATCACAGGCTTTAAGTTGCAGGGTAAGAATATCTCCCCTGCTTTCGGATTCAAACGCAAGCTCACAACAGAGACTGAACGAAGATACAATCGTGAAGTTTGATTACGATGAATCTCCGCTGCGGGACTACCTAAATAAAAGGCCAGAAAGGACTTTTTTTGTAGGTGGGGCTTCTAAAAACGATGCTATTGTGAAGAAGTTTGCTCAAGTCATTGGTGCTACAAAGGGTAATTTTAGGCTAGAAACACCAAACTCATGTGCCCTTGGTGGTTGTTATAAGGCCATGTGGTCATTGTTATATGACTCTAATAAAATTGCAGTTCCTTTTGATAAATTTCTGAATGACAATTTTCCATGGCATGTAATGGAAAGCATATCCGATGTGGATAATGAAAATTGGGATCGCTATAATTCCAAGATTGTCCCCTTAAGCGAACTG GAAAAGACTCTCATCTAATKL1 (S. 
DNA 217 ATGACTCAATTCACTGACATTGATAAGCTAGCCGTCTCCACCATAcerevisiae) AGAATTTTGGCTGTGGACACCGTATCCAAGGCCAACTCAGGTCACCCAGGTGCTCCATTGGGTATGGCACCAGCTGCACACGTTCTATGGAGTCAAATGCGCATGAACCCAACCAACCCAGACTGGATCAACAGAGATAGATTTGTCTTGTCTAACGGTCACGCGGTCGCTTTGTTGTATTCTATGCTACATTTGACTGGTTACGATCTGTCTATTGAAGACTTGAAACAGTTCAGACAGTTGGGTTCCAGAACACCAGGTCATCCTGAATTTGAGTTGCCAGGTGTTGAAGTTACTACCGGTCCATTAGGTCAAGGTATCTCCAACGCTGTTGGTATGGCCATGGCTCAAGCTAACCTGGCTGCCACTTACAACAAGCCGGGCTTTACCTTGTCTGACAACTACACCTATGTTTTCTTGGGTGACGGTTGTTTGCAAGAAGGTATTTCTTCAGAAGCTTCCTCCTTGGCTGGTCATTTGAAATTGGGTAACTTGATTGCCATCTACGATGACAACAAGATCACTATCGATGGTGCTACCAGTATCTCATTCGATGAAGATGTTGCTAAGAGATACGAAGCCTACGGTTGGGAAGTTTTGTACGTAGAAAATGGTAACGAAGATCTAGCCGGTATTGCCAAGGCTATTGCTCAAGCTAAGTTATCCAAGGACAAACCAACTTTGATCAAAATGACCACAACCATTGGTTACGGTTCCTTGCATGCCGGCTCTCACTCTGTGCACGGTGCCCCATTGAAAGCAGATGATGTTAAACAACTAAAGAGCAAATTCGGTTTCAACCCAGACAAGTCCTTTGTTGTTCCACAAGAAGTTTACGACCACTACCAAAAGACAATTTTAAAGCCAGGTGTCGAAGCCAACAACAAGTGGAACAAGTTGTTCAGCGAATACCAAAAGAAATTCCCAGAATTAGGTGCTGAATTGGCTAGAAGATTGAGCGGCCAACTACCCGCAAATTGGGAATCTAAGTTGCCAACTTACACCGCCAAGGACTCTGCCGTGGCCACTAGAAAATTATCAGAAACTGTTCTTGAGGATGTTTACAATCAATTGCCAGAGTTGATTGGTGGTTCTGCCGATTTAACACCTTCTAACTTGACCAGATGGAAGGAAGCCCTTGACTTCCAACCTCCTTCTTCCGGTTCAGGTAACTACTCTGGTAGATACATTAGGTACGGTATTAGAGAACACGCTATGGGTGCCATAATGAACGGTATTTCAGCTTTCGGTGCCAACTACAAACCATACGGTGGTACTTTCTTGAACTtCGTTTCTTATGCTGCTGGTGCCGTTAGATTGTCCGCTTTGTCTGGCCACCCAGTTATTTGGGTTGCTACACATGACTCTATCGGTGTCGGTGAAGATGGTCCAACACATCAACCTATTGAAACTTTAGCACACTTCAGATCCCTACCAAACATTCAAGTTTGGAGACCAGCTGATGGTAACGAAGTTTCTGCCGCCTACAAGAACTCTTTAGAATCCAAGCATACTCCAAGTATCATTGCTTTGTCCAGACAAAACTTGCCACAATTGGAAGGTAGCTCTATTGAAAGCGCTTCTAAGGGTGGTTACGTACTACAAGATGTTGCTAACCCAGATATTATTTTAGTGGCTACTGGTTCCGAAGTGTCTTTGAGTGTTGAAGCTGCTAAGACTTTGGCCGCAAAGAACATCAAGGCTCGTGTTGTTTCTCTACCAGATTTCTTCACTTTTGACAAACAACCCCTAGAATACAGACTATCAGTCTTACCAGACAACGTTCCAATCATGTCTGTTGAAGTTTTGGCTACCACATGTTGGGGCAAATACGCTCATCAATCCTTCGGTATTGACAGATTTGGTGCCTCCGGTAAGGCACCAGAAGTCTTCAAGTTCTTCGGTTTCACCCCAGAAGGTGTTGCTGAAAGAGCT
CAAAAGACCATTGCATTCTATAAGGGTGACAAGCTAATTTCTCCTTTGAAAAAAGCTTTCTAA RPE1 (S. DNA 218ATGGTCAAACCAATTATAGCTCCCAGTATCCTTGCTTCTGACTTC cerevisiae)GCCAACTTGGGTTGCGAATGTCATAAGGTCATCAACGCCGGCGCAGATTGGTTACATATCGATGTCATGGACGGCCATTTTGTTCCAAACATTACTCTGGGCCAACCAATTGTTACCTCCCTACGTCGTTCTGTGCCACGCCCTGGCGATGCTAGCAACACAGAAAAGAAGCCCACTGCGTTCTTCGATTGTCACATGATGGTTGAAAATCCTGAAAAATGGGTCGACGATTTTGCTAAATGTGGTGCTGACCAATTTACGTTCCACTACGAGGCCACACAAGACCCTTTGCATTTAGTTAAGTTGATTAAGTCTAAGGGCATCAAAGCTGCATGCGCCATCAAACCTGGTACTTCTGTTGACGTTTTATTTGAACTAGCTCCTCATTTGGATATGGCTCTTGTTATGACTGTGGAACCTGGGTTTGGAGGCCAAAAATTCATGGAAGACATGATGCCAAAAGTGGAAACTTTGAGAGCCAAGTTCCCCCATTTGAATATCCAAGTCGATGGTGGTTTGGGCAAGGAGACCATCCCGAAAGCCGCCAAAGCCGGTGCCAACGTTATTGTCGCTGGTACCAGTGTTTTCACTGCAGCTGACCCGCACGATGTTATCTCCTTCATGAAAGAAGAAGTCTCGAAGGAATTGCGTTCTAGAGATTTGCTAGATTAG RKI1 (S. DNA 219ATGGCTGCCGGTGTCCCAAAAATTGATGCGTTAGAATCTTTGGGC cerevisiae)AATCCTTTGGAGGATGCCAAGAGAGCTGCAGCATACAGAGCAGTTGATGAAAATTTAAAATTTGATGATCACAAAATTATTGGAATTGGTAGTGGTAGCACAGTGGTTTATGTTGCCGAAAGAATTGGACAATATTTGCATGACCCTAAATTTTATGAAGTAGCGTCTAAATTCATTTGCATTCCAACAGGATTCCAATCAAGAAACTTGATTTTGGATAACAAGTTGCAATTAGGCTCCATTGAACAGTATCCTCGCATTGATATAGCGTTTGACGGTGCTGATGAAGTGGATGAGAATTTACAATTAATTAAAGGTGGTGGTGCTTGTCTATTTCAAGAAAAATTGGTTAGTACTAGTGCTAAAACCTTCATTGTCGTTGCTGATTCAAGAAAAAAGTCACCAAAACATTTAGGTAAGAACTGGAGGCAAGGTGTTCCCATTGAAATTGTACCTTCCTCATACGTGAGGGTCAAGAATGATCTATTAGAACAATTGCATGCTGAAAAAGTTGACATCAGACAAGGAGGTTCTGCTAAAGCAGGTCCTGTTGTAACTGACAATAATAACTTCATTATCGATGCGGATTTCGGTGAAATTTCCGATCCAAGAAAATTGCATAGAGAAATCAAACTGTTAGTGGGCGTGGTGGAAACAGGTTTATTCATCGACAACGCTTCAAAAGCCTACTTCGGTAATTCTGACGGTAGTGTTGAAGTT ACCGAAAAGTGA GAL2 (S. 
DNA220 ATGGCAGTTGAGGAGAACAATATGCCTGTTGTTTCACAGCAACCC cerevisiae)CAAGCTGGTGAAGACGTGATCTCTTCACTCAGTAAAGATTCCCATTTAAGCGCACAATCTCAAAAGTATTCTAATGATGAATTGAAAGCCGGTGAGTCAGGGTCTGAAGGCTCCCAAAGTGTTCCTATAGAGATACCCAAGAAGCCCATGTCTGAATATGTTACCGTTTCCTTGCTTTGTTTGTGTGTTGCCTTCGGCGGCTTCATGTTTGGCTGGGATACCGGTACTATTTCTGGGTTTGTTGTCCAAACAGACTTTTTGAGAAGGTTTGGTATGAAACATAAGGATGGTACCCACTATTTGTCAAACGTCAGAACAGGTTTAATCGTCGCCATTTTCAATATTGGCTGTGCCTTTGGTGGTATTATACTTTCCAAAGGTGGAGATATGTATGGCCGTAAAAAGGGTCTTTCGATTGTCGTCTCGGTTTATATAGTTGGTATTATCATTCAAATTGCCTCTATCAACAAGTGGTACCAATATTTCATTGGTAGAATCATATCTGGTTTGGGTGTCGGCGGCATCGCCGTCTTATGTCCTATGTTGATCTCTGAAATTGCTCCAAAGCACTTGAGAGGCACACTAGTTTCTTGTTATCAGCTGATGATTACTGCAGGTATCTTTTTGGGCTACTGTACTAATTACGGTACAAAGAGCTATTCGAACTCAGTTCAATGGAGAGTTCCATTAGGGCTATGTTTCGCTTGGTCATTATTTATGATTGGCGCTTTGACGTTAGTTCCTGAATCCCCACGTTATTTATGTGAGGTGAATAAGGTAGAAGACGCCAAGCGTTCCATTGCTAAGTCTAACAAGGTGTCACCAGAGGATCCTGCCGTCCAGGCAGAGTTAGATCTGATCATGGCCGGTATAGAAGCTGAAAAACTGGCTGGCAATGCGTCCTGGGGGGAATTATTTTCCACCAAGACCAAAGTATTTCAACGTTTGTTGATGGGTGTGTTTGTTCAAATGTTCCAACAATTAACCGGTAACAATTATTTTTTCTACTACGGTACCGTTATTTTCAAGTCAGTTGGCCTGGATGATTCCTTTGAAACATCCATTGTCATTGGTGTAGTCAACTTTGCCTCCACTTTCTTTAGTTTGTGGACTGTCGAAAACTTGGGACATCGTAAATGTTTACTTTTGGGCGCTGCCACTATGATGGCTTGTATGGTCATCTACGCCTCTGTTGGTGTTACTAGATTATATCCTCACGGTAAAAGCCAGCCATCTTCTAAAGGTGCCGGTAACTGTATGATTGTCTTTACCTGTTTTTATATTTTCTGTTATGCCACAACCTGGGCGCCAGTTGCCTGGGTCATCACAGCAGAATCATTCCCACTGAGAGTCAAGTCGAAATGTATGGCGTTGGCCTCTGCTTCCAATTGGGTATGGGGGTTCTTGATTGCATTTTTCACCCCATTCATCACATCTGCCATTAACTTCTACTACGGTTATGTCTTCATGGGCTGTTTGGTTGCCATGTTTTTTTATGTCTTTTTCTTTGTTCCAGAAACTAAAGGCCTATCGTTAGAAGAAATTCAAGAATTATGGGAAGAAGGTGTTTTACCTTGGAAATCTGAAGGCTGGATTCCTTCATCCAGAAGAGGTAATAATTACGATTTAGAGGATTTACAACATGACGACAAACCGTGGTACAAGGCCATGCTAGAATAA

A vector named pYDAB008 rDNA (FIG. 6), for integrating xylose isomerase into ribosomal DNA loci in the S. cerevisiae genome, was constructed using conventional cloning methods. This vector can confer high copy number integration of genes, resulting in high-level expression of proteins. The vector was derived from pBluescript II SK (+) (Agilent Technologies, Inc., Santa Clara, Calif.). The pUC origin of replication and the bla gene encoding ampicillin resistance, which serves as a selectable marker for cloning, were amplified with specific primer sequences. A 741 base-pair R1 region, a 253 base-pair R3 region and an 874 base-pair R2 region were amplified from yeast genomic DNA by PCR. A multiple cloning site of SEQ ID NO:181 (5′-GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC-3′) was inserted between the R1 and R3/R2 regions by assembly using overlapping PCR. All primers used in the above reactions are shown in Table 14. The overlapping PCR products were then ligated in one reaction, resulting in the rDNA integration plasmid named pYDAB008 rDNA (FIG. 6).

TABLE 14
Primers Used in pYDAB008 rDNA vector construction

Primer              SEQ ID NO:   Sequence (the PacI restriction site is TTAATTAA)
PacI-rDNA(R1)-R     221          CACCATTAATTAACCCGGGGCACCTGTCACTTTGGAA
rDNA (R1)-over-R    222          CGCGTAAGCTTTCTAGAGGCGCGCCAAGCTTTTACACTCTTGACCAGCGCA
AB vector-MCS-R     223          CCGCTGGTGGGTACCGATATCCCTGCAGGGAGCTCACGCGTAAGCTTTCTAGAGGCG
rDNA(R3)-over-R     224          CTGCAGGGATATCGGTACCCACCAGCGGCCGCAGGCCTTGGGTGCTTGCTGGCGAA
rDNA(R3)-over-R     225          ACCTCTGCATGCGAATTCTTAAGACAAATAAAATTTATAGAGACTTGT
rDNA(R2)-over-R     226          GTCTTAAGAATTCGCATGCAGAGGTAGTTTCAAGGT
PacI-rDNA(R2)-R     227          CACCATTAATTAATACGTATTTCTCGCCGAGAAAAACTT
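The multiple cloning site of SEQ ID NO:181 and the PacI-tagged primers can be checked for the restriction sites used in this example with a simple string search. The sketch below is illustrative; the recognition sequences for Asc I (GGCGCGCC), Kpn I (GGTACC) and Pac I (TTAATTAA) are the standard ones for these enzymes, and the helper name `site_positions` is an assumption introduced here.

```python
# Multiple cloning site of SEQ ID NO:181 and primer SEQ ID NO:221 (Table 14).
MCS = "GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC"
PRIMER_221 = "CACCATTAATTAACCCGGGGCACCTGTCACTTTGGAA"

# Recognition sequences of the enzymes named in this example.
SITES = {"AscI": "GGCGCGCC", "KpnI": "GGTACC", "PacI": "TTAATTAA"}


def site_positions(seq, site):
    """Return the 1-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)  # step by one to catch overlaps
    return positions


for enzyme, site in SITES.items():
    print(enzyme, "in MCS:", site_positions(MCS, site))
print("PacI in primer 221:", site_positions(PRIMER_221, SITES["PacI"]))
```

The search covers one strand only, which is sufficient here because all three recognition sequences are palindromic.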

pYDABF-0015 (a plasmid comprising a Boles codon optimized nucleic acid of SEQ ID NO:244, encoding a xylose isomerase of SEQ ID NO:78) and pYDABF-0026 (a plasmid comprising a Boles codon optimized nucleic acid of SEQ ID NO:245, encoding a xylose isomerase of SEQ ID NO:96) (both described in Example 10) were digested with Asc I and Kpn I restriction enzymes (New England Biolabs Inc., MA, USA) and the XI-coding inserts were ligated to the pYDAB008 rDNA integration vector described above (FIG. 6). The resulting plasmids were named pYDABF-0033 (SEQ ID NO:78) and pYDABF-0036 (SEQ ID NO:96). Additionally, Boles codon optimized nucleic acids encoding the xylose isomerases of SEQ ID NO:54 and SEQ ID NO:58 (SEQ ID NO:238 and SEQ ID NO:239, respectively) were ordered from Genewiz (Genewiz Inc., NJ, USA), digested with Asc I and Kpn I restriction enzymes (New England Biolabs Inc., MA, USA) and ligated to the pYDAB008 rDNA integration vector (FIG. 6). The codon-optimized sequences are set forth in Table 15, below:

TABLE 15 SEQ Sequence ID Description NO: Sequence Codon optimized DNA238 ATGGCTAAGGAATACTTCCCAGAAATTGGTAAGATTAAGTTCGAA encoding XI of SEQGGTAAGGACTCTAAGAACCCAATGGCTTTCCACTACTACGACCCA ID NO: 54GAAAAGGTTATTATGGGTAAGCCAATGAAGGACTGGTTGAGATTCGCTATGGCTTGGTGGCACACCTTGTGTGCTGAAGGTGGTGACCAATTCGGTGGTGGTACTAAGAAGTTCCCATGGAACAACGGTGCTGACGCTGTTGAAATTGCTAAGCAAAAGGCTGACGCTGGTTTCGAAATTATGCAAAAGTTGGGTATTCCATACTTCTGTTTCCACGACGTTGACTTGGTTTCTGAAGGTGCTTCTGTTGAAGAATACGAAGCTAACTTGAAGGCTATTACCGACTACTTGGCTGTTAAGATGAAGGAAACCGGAATTAAGTTGTTGTGGTCTACCGCTAACGTTTTCGGTAACGGTAGATACATGAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGACGCTGGTATTAAGTTGGGTGCTGAAAACTACGTTTTCTGGGGTGGTAGAGAAGGTTACATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTACGCTAGAGCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATACGACGTTGACACCGAAACCGTTATTGGTTTCTTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTGTTGCTGTTGACAACAACATGTTGGGTTCTATTGACGCTAACAGAGGTGACTACCAAAACGGTTGGGACACCGACCAATTCCCAATTGACCAATACGAATTGGTTCAAGCTTGGATGGAAATTATTAGAGGTGGTGGTTTGGGTACAGGTGGTACTAACTTCGACGCTAAGACCAGAAGAAACTCTACCGACTTGGAAGACATTTTCATTGCTCACATTGCTGGTATGGACGCTATGGCTAGAGCTTTGGAATCTGCTGCTAAGTTGTTGGAAGAATCTCCATACAAGGCTATGAAGGCTGCTAGATACGCTTCTTTCGACAACGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGACCTTGGAACAAGCTTACGAATACGGTAAGAAGGTTGGTGAACCAAAGCAAACCTCTGGTAAGCAAGAATTGTACGAAGCTATTGTTGCTATG TACGCTTAACodon optimized DNA 239 ATGGCTAAGGAATACTTCCCAGAAATTGGTAAGATTAAGTTCGAAencoding XI of SEQ GGTAAGGACTCTAAGAACCCAATGGCTTTCCACTACTACGACGCTID NO: 58 
GAAAAGGTTATTATGGGTAAGCCAATGAAGGAATGGTTGAGATTCGCTATGGCTTGGTGGCACACCTTGTGTGCTGAAGGTGGTGACCAATTCGGTGGTGGTACTAAGAAGTTCCCATGGAACGAAGGTACTGACGCTGTTACCATTGCTAAGCAAAAGGCTGACGCTGGTTTCGAAATTATGCAAAAGTTGGGTTTCCCATACTTCTGTTTCCACGACATTGACTTGGTTTCTGAAGGTAACTCTATTGAAGAATACGAAGCTAACTTGCAAGCTATTACCGACTACTTGAAGGTTAAGATGGAAGAAACCGGAATTAAGTTGTTGTGGTCTACCGCTAACGTTTTCGGTAACGGTAGATACATGAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGACGCTGGTATTAAGTTGGGTGCTGAAAACTACGTTTTCTGGGGTGGTAGAGAAGGTTACATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTACGCTAGATCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATACGACGTTGACACCGAAACCGTTATTGGTTTCTTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTGTTGCTGTTGACAACGGTATGTTGGGTTCTATTGACGCTAACAGAGGTGACTACCAAAACGGTTGGGACACCGACCAATTCCCAATTGACCAATACGAATTGGTTCAAGCTTGGATGGAAATTATTAGAGGTGGTGGTTTGGGTACTGGTGGTACAAACTTCGACGCTAAGACCAGAAGAAACTCTACCGACTTGGAAGACATTTTCATTGCTCACATTTCTGGTATGGACGCTATGGCTAGAGCTTTGGAATCTGCTGCTAAGTTGTTGGAAGAATCTCCATACTGTGCTATGAAGAAGGCTAGATACGCTTCTTTCGACTCTGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGACCTTGGAACAAGCTTACGAATACGGTAAGAAGGTTGGTGAACCAAAGCAAACCTCTGGTAAGCAAGAATTGTACGAAGCTATTGTTGCTATG TACGCTTAACodon optimized DNA 244 ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAAencoding XI of SEQ GGTAAGGACTCTAAGAACGTTATGGCTTTCCACTATTATGAACCAID NO: 78 
GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTCGCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAATTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGGTGACGCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATTATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGACTTGGTTGAAGAAGGTGACACCATTGAAGAATATGAAGCTAGAATGAAGGCTATTACCGACTATGCTCAAGAAAAGATGAAGCAATTCCCAAACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTAACAAGAGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATTAAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGTTATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTCTTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAACAGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATTGACAACTATGAATTGACCCAAGCTATGTTGGAAATTATTAGAAACGGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGAAGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCTGGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGACATTTTGGAAAACTCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCTTCTTTCGACCAAGGTGTTGGTAAGGACTTCGAAGACGGTAAGTTGACCTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGGTGAACCAAAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT TTGTATGCTAAGTAACodon optimized DNA 245 ATGGCTAAGGAATATTTCCCATTCATTGGTAAGGTTCCATTCGAAencoding XI of SEQ GGTACTGAATCTAAGAACGTTATGGCTTTCCACTATTATGAACCAID NO: 96 
GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTCGCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAATTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGCTGACGCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATTATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGACTTGGTTGAAGAAGGTGAAACCGTTGCTGAATATGAAGCTAGAATGAAGGTTATTACCGACTATGCTTTGGAAAAGATGCAACAATTCCCAAACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTCACAAGAGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATTAAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGTTATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTCTTGAGGGCTCACGGTTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAACAGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATTGACAACTATGAATTGACCCAAGCTATGATGGAAATTATTAGAAACGGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGAAGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCTGGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGCTATTTTGGAAGAATCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCTTCTTTCGACGAAGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGTCTTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGAAGAACCAAAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT TTGTATGCTAAGTAA

The resulting plasmids were named pYDABF-0033 (SEQ ID NO:78), pYDABF-0036 (SEQ ID NO:96), pYDABF-0231 (SEQ ID NO:54) and pYDABF-0232 (SEQ ID NO:58).
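Because codon optimization changes the nucleotide sequence but must leave the encoded protein unchanged, a translation check is a useful sanity test on sequences such as those in Table 15. The sketch below translates the first 45 nucleotides of the Boles codon optimized SEQ ID NO:244 and of the DNA 2.0 codon optimized SEQ ID NO:247 (both encoding the XI of SEQ ID NO:78) using the standard genetic code. It is an illustration only; the helper name `translate` is introduced here and is not part of the disclosure.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 45 nt of two differently optimized sequences for the XI of SEQ ID NO:78.
boles_244 = "ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA"
dna20_247 = "ATGGCTAAGGAATACTTTCCATTCACCGGAAAGATACCATTTGAA"

print(translate(boles_244))
print(translate(dna20_247))
print("nucleotides differ:", boles_244 != dna20_247)
```

Applied to full-length sequences, the same comparison would flag any optimization step that accidentally altered the amino acid sequence.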

The rDNA integration cassette was linearized by Pac I restriction enzyme digestion (New England Biolabs Inc., MA, USA) and purified with a DNA column purification kit (Zymo Research, Irvine, Calif., USA). The integration cassette was transformed into the modified haploid S. cerevisiae strains pBPB007 (MATa::ura3) and pBPB008 (MATalpha::ura3) using the standard protocol described in previous examples. Transformants were plated on SC-xylose (SC complete+2% xylose) agar plates and incubated for about 2-3 days at about 30° C. Colonies that grew on SC-xylose agar plates were then checked by colony PCR analysis with the primer sets shown in Table 16 (SEQ ID NOs:228, 229, 230, 231) to confirm the presence of xylose isomerase in the genome.

TABLE 16
Primers Used in Integration Verification

Primer      SEQ ID NO:   Sequence
N16PCR_F    228          CCCCATCGACAACTACGAGCTCACT
N16PCR_R    229          CAACTTGCCGTCCTCGAAGTCCTTG
N05PCR_F    230          CGAGCCTGAGAAGGTCGTGATGGGA
N05PCR_R    231          TACGTCGAAGTCGGGGTTGGTAGAA
N08PCR_F    240          TACTTGGCTGTTAAGATGAAG
N08PCR_R    241          ATCTAGCAGCCTTCATAGCCTT
N17PCR_F    242          CGAAGGTACTGACGCTGTTACC
N17PCR_R    243          CGAAAGAAGCGTATCTAGCCTT
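Colony PCR verification of this kind can be mirrored in silico by locating the forward primer and the reverse complement of the reverse primer on a template and reading off the expected amplicon size. The sketch below uses the N17PCR primer pair from Table 16; the template is a hypothetical toy construct assembled for illustration (it is not a sequence of this disclosure), and the function names are assumptions introduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, fwd, rev):
    """Expected product size for exact-match primers, or None if either is absent.

    Only the top strand is searched: fwd must occur as written and rev must
    occur as its reverse complement downstream of fwd.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


N17PCR_F = "CGAAGGTACTGACGCTGTTACC"  # SEQ ID NO:242
N17PCR_R = "CGAAAGAAGCGTATCTAGCCTT"  # SEQ ID NO:243

# Hypothetical template: both primer sites separated by 100 nt of filler.
toy_template = "ATGGCT" + N17PCR_F + "ACGT" * 25 + reverse_complement(N17PCR_R) + "TAA"

print(amplicon_length(toy_template, N17PCR_F, N17PCR_R))
```

A template lacking either primer site returns `None`, which corresponds to a colony that yields no band for that primer pair.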

Confirmed haploid strains were BD31328 (MATa) and BD31336 (MATalpha), BD31526 (MATa) and BD31527 (MATalpha), BD34364 (MATa) and BD34365 (MATalpha), and BD34366 (MATa) and BD34367 (MATalpha). Diploid strains BD31378 (expressing a xylose isomerase of SEQ ID NO:96), BD31365 (expressing a xylose isomerase of SEQ ID NO:78), BD34369 (expressing a xylose isomerase of SEQ ID NO:54) and BD34377 (expressing a xylose isomerase of SEQ ID NO:58) were generated by conventional plate mating on YPXylose (YP+2% xylose) agar plates for about 2 days at about 30° C. Colony PCR with specific primers for checking mating types (shown in Table 17) was performed, and single colonies having both MATa and MATalpha were picked as diploid strains BD31378 (SEQ ID NO:96), BD31365 (SEQ ID NO:78), BD34369 (SEQ ID NO:54) and BD34377 (SEQ ID NO:58).

A linear fragment encoding the URA3 sequence (SEQ ID NO:237;TTAATTAAGTTAATTACCTTTTTTGCGAGGCATATTTATGGTGAAGAATAAGTTTTGACCATCAAAGAAGGTTAATGTGGCTGTGGTTTCAGGGTCCATAAAGCTTTTCAATTCATCATTTTTTTTTTATTCTTTTTTTTGATTCCGGTTTCCTTGAAATTTTTTTGATTCGGTAATCTCCGAACAGAAGGAAGAACGAAGGAAGGAGCACAGACTTAGATTGGTATATATACGCATATGTAGTGTTGAAGAAACATGAAATTGCCCAGTATTCTTAACCCAACTGCACAGAACAAAAACCTGCAGGAAACGAAGATAAATCATGTCGAAAGCTACATATAAGGAACGTGCTGCTACTCATCCTAGTCCTGTTGCTGCCAAGCTATTTAATATCATGCACGAAAAGCAAACAAACTTGTGTGCTTCATTGGATGTTCGTACCACCAAGGAATTACTGGAGTTAGTTGAAGCATTAGGTCCCAAAATTTGTTTACTAAAAACACATGTGGATATCTTGACTGATTTTTCCATGGAGGGCACAGTTAAGCCGCTAAAGGCATTATCCGCCAAGTACAATTTTTTACTCTTCGAAGACAGAAAATTTGCTGACATTGGTAATACAGTCAAATTGCAGTACTCTGCGGGTGTATACAGAATAGCAGAATGGGCAGACATTACGAATGCACACGGTGTGGTGGGCCCAGGTATTGTTAGCGGTTTGAAGCAGGCGGCAGAAGAAGTAACAAAGGAACCTAGAGGCCTTTTGATGTTAGCAGAATTGTCATGCAAGGGCTCCCTAGCTACTGGAGAATATACTAAGGGTACTGTTGACATTGCGAAGAGCGACAAAGATTTTGTTATCGGCTTTATTGCTCAAAGAGACATGGGTGGAAGAGATGAAGGTTACGATTGGTTGATTATGACACCCGGTGTGGGTTTAGATGACAAGGGAGACGCATTGGGTCAACAGTATAGAACCGTGGATGATGTGGTCTCTACAGGATCTGACATTATTATTGTTGGAAGAGGACTATTTGCAAAGGGAAGGGATGCTAAGGTAGAGGGTGAACGTTACAGAAAAGCAGGCTGGGAAGCATATTTGAGAAGATGCGGCCAGCAAAACTAAAAAACTGTATTATAAGTAAATGCATGTATACTAAACTCACAAATTAGAGCTTCAATTTAATTATATCAGTTATTACCCGGGAATCTCGGTCGTAATGATTTTTATAATGACGAAAAAAAAAAAATTGGAAAGAAAAAGCTTCATGGCCTTTATAAAAAGGAACCATCCAATACCTCGCCAGAACCAAGTAACAGTATTTTACGGTTAATTAA) was transformed into BD 31378(SEQ ID NO:96), BD31365 (SEQ ID NO:78), BD34369 (SEQ ID NO:54) andBD34377 (SEQ ID NO:58) by a conventional transformation protocol, andtransformants were plated on SCXylose-URA (Synthetic Complete, Uracildropout) for selection. Colonies were checked by PCR with primers shownin Table 17, SEQ ID NO:235, SEQ ID NO:236). Confirmed strains areBD31446 (SEQ ID NO:78), BD31448 (SEQ ID NO:96), BD34373 (SEQ ID NO:54)and BD34378 (SEQ ID NO:58).

TABLE 17
Primers Used in Mating Type Verification

Primer                   SEQ ID NO:   Sequence
1-mating type-R          232          AGTCACATCAAGATCGTTTAT
2-mating type alpha-F    233          GCACGGAATATGGGACTACTT
3-mating type a-F        234          ACTCCACTTCAAGTAAGAGTT
Ura fix-F                235          GAACAAAAACCTGCAGGAAACGAAGAT
Ura fix-R                236          GCTCTAATTTGTGAGTTTAGTATACATGCAT

Table 18 below shows the genotypes of the resulting yeast strains:

TABLE 18
Strain Construction

Name      Parent Strain      Description
pBPB007   yBPA130            MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2
pBPB008   yBPA136            MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2
BD31328   pBPB007            MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31336   pBPB008            MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31526   pBPB007            MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD31527   pBPB008            MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD34364   pBPB007            MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34365   pBPB008            MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34366   pBPB007            MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD34367   pBPB008            MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD31378   BD31328, BD31336   MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31365   BD31526, BD31527   MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD34369   BD34364, BD34365   MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34377   BD34366, BD34367   MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD31448   BD31378            MATa/alpha, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31446   BD31365            MATa/alpha, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)

5.13 Example 12: Fermentation Performance of Yeast Strain Expressing Different Xylose Isomerases

Fermentation performances of two different XI-expressing yeast strains were evaluated using the DasGip fermentation systems (Eppendorf, Inc.). DasGip fermenters allowed close control over agitation, pH, and temperature, ensuring consistency of the environment during fermentation. DasGip fermenters were used to test performance of the yeast strains expressing the XI genes on hydrolysate (Hz) (neutralized with magnesium bases) as a primary carbon source. Prior to the start of fermentation, strains were subjected to propagation testing consisting of two steps as described below.

Seed 1:

About 1 ml of strain glycerol stock was inoculated into about 100 ml of YP (Yeast extract, Peptone) medium containing about 2% glucose and about 1% xylose in a 250 ml Bellco baffled flask (Bellco, Inc.). Strains were cultivated at about 30° C. with about 200 rpm agitation for at least 18 hours until full saturation. Optical density was assessed by measuring light absorbance at a wavelength of 600 nm.

Seed 2:

About 20 ml of saturated SEED 1 (see preceding paragraph) was inoculated into a 3 L Bioflo unit (New Brunswick, Inc.) containing about 2.1 L of basal medium at pH 6.0 (1% v/v inoculation). Cultivation was conducted at about 30° C. in a fed batch mode with a constant air flow of about 2 L/min. The agitation ramp was about 200-626 rpm over about 15 hours, starting at about 5 hours of elapsed fermentation time (EFT). The feeding profile was about 0-4.8 ml/min over 20 hours. The basal medium contained (per 1 L): about 20% of neutralized hydrolysate (Hz); about 20 g/L sucrose (from cane juice); about 35 ml of nutrients mixture (Table 19); about 1 ml of vitamin mixture (Table 20); about 0.4 ml of antifoam 1410 (Dow Corning, Inc.) and water. The feed medium contained (per 1 L): about 20% neutralized hydrolysate (Hz); about 110 g/L sucrose (from cane juice); about 35 ml of nutrient mixture; about 1 ml of vitamin mixture; about 0.4 ml of antifoam 1410 (Dow Corning, Inc.) and water.
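The agitation ramp described for Seed 2 can be expressed as a setpoint function of elapsed fermentation time. The sketch below assumes that the ramp is linear between its end points, which the text does not state; the function name and the hold-before/hold-after behaviour are likewise assumptions made for illustration. The inoculation ratio is computed from the stated volumes.

```python
def agitation_rpm(eft_h, start_h=5.0, duration_h=15.0, rpm_min=200.0, rpm_max=626.0):
    """Agitation setpoint (rpm) at a given elapsed fermentation time in hours.

    Assumes (not stated in the text) a linear ramp from rpm_min to rpm_max,
    holding rpm_min before the ramp starts and rpm_max after it ends.
    """
    if eft_h <= start_h:
        return rpm_min
    if eft_h >= start_h + duration_h:
        return rpm_max
    fraction = (eft_h - start_h) / duration_h
    return rpm_min + (rpm_max - rpm_min) * fraction


# About 20 ml of SEED 1 into about 2.1 L (2100 ml) of basal medium.
inoculation_pct = 100.0 * 20.0 / 2100.0

for t in (0, 5, 12.5, 20, 24):
    print(f"EFT {t:>4} h: {agitation_rpm(t):.0f} rpm")
print(f"inoculation: {inoculation_pct:.2f}% v/v")
```

The computed ratio is slightly below 1%, consistent with the "about" qualifiers on both volumes.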

TABLE 19
Nutrients mixture

Component       FW (g/mol)   Conc.
KH₂PO₄·H₂O      154.1        99.1 g/L
Urea            60.06        65.6 g/L
MgSO₄·7H₂O      192.4        14.6 g/L
DI Water        NA           To 1.0 L

TABLE 20
Vitamin mixture (1000×)

Components   mM
ZnSO₄        100
H₃BO₃        24
KI           1.8
MnSO₄        20
CuSO₄        10
Na₂MoO₄      1.5
CoCl₂        1.5
FeCl₃        1.23
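The stock concentrations in Tables 19 and 20 translate into final medium concentrations by simple dilution, given the stated additions of about 35 ml of nutrients mixture and about 1 ml of vitamin mixture per litre of medium. The sketch below is a worked illustration of that arithmetic; the dictionary keys and the helper name `diluted` are labels introduced here.

```python
# Stock concentrations from Table 19 (g/L) and Table 20 (mM).
NUTRIENT_STOCK_G_PER_L = {"KH2PO4.H2O": 99.1, "Urea": 65.6, "MgSO4.7H2O": 14.6}
VITAMIN_STOCK_MM = {
    "ZnSO4": 100, "H3BO3": 24, "KI": 1.8, "MnSO4": 20,
    "CuSO4": 10, "Na2MoO4": 1.5, "CoCl2": 1.5, "FeCl3": 1.23,
}

NUTRIENT_ML_PER_L = 35.0  # about 35 ml of nutrients mixture per litre of medium
VITAMIN_ML_PER_L = 1.0    # about 1 ml of 1000x vitamin mixture per litre


def diluted(stock_conc, ml_per_l):
    """Final concentration after adding ml_per_l of stock to 1 L total volume."""
    return stock_conc * ml_per_l / 1000.0


# Final nutrient concentrations in g/L of medium.
final_nutrients = {
    name: diluted(conc, NUTRIENT_ML_PER_L)
    for name, conc in NUTRIENT_STOCK_G_PER_L.items()
}
# Final trace component concentrations in micromolar (mM x 1000 after dilution).
final_vitamins_um = {
    name: diluted(conc, VITAMIN_ML_PER_L) * 1000.0
    for name, conc in VITAMIN_STOCK_MM.items()
}

for name, conc in final_nutrients.items():
    print(f"{name:<12} {conc:.3f} g/L")
for name, conc in final_vitamins_um.items():
    print(f"{name:<12} {conc:.2f} uM")
```

Because the vitamin mixture is a 1000× stock added at 1 ml per litre, each final micromolar value equals the millimolar figure in Table 20.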

DasGip Fermentation:

Strains were tested in small scale fermentation using the DasGip system in the industrially relevant medium containing detoxified hydrolysate and sucrose. Strains were propagated as described above; DasGip inoculation was performed using the following protocol:

Cell dry weight of SEED 2 was assessed based on the final optical density. The correlation between cell dry weight and optical density (600 nm) was used to estimate the volume of the SEED 2 culture needed for fermentation. The targeted inoculation level was about 7% v/v; about 1.5 g/L cell dry weight. The appropriate volume of SEED 2 culture was harvested by centrifugation (about 5000 rpm for 10 min) to pellet the cells, which were resuspended in about 17.5 ml of PBS. The resuspended cell solution was used to inoculate a 500 ml DasGip unit containing about 250 ml of detoxified hydrolysate and nutrient solution (about 3.5 ml/100 ml of medium). Fermentation was performed at about 32° C. at pH 6.3 with about 200 rpm agitation. The duration of fermentation was about 92 hours with regular sampling. Sampling was conducted with a 25 ml serological pipette through the port in the head plate of the DasGip unit. About 3 ml of culture was taken out and centrifuged (about 5000 rpm for 10 min) to pellet the cells, and the supernatant was submitted for analysis. Standard analytical techniques such as high-pressure liquid chromatography (HPLC) were used to determine the concentrations of sugars and ethanol in the medium. Fermentation performances for yeast strains BD31378 (expressing a xylose isomerase of SEQ ID NO:96) and BD31365 (expressing a xylose isomerase of SEQ ID NO:78) are presented in FIG. 7A and FIG. 7B, respectively.
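The seed volume estimate described above follows from a mass balance on cell dry weight. The sketch below illustrates the calculation; the optical density reading and the OD-to-dry-weight conversion factor are hypothetical values chosen for the example (the disclosure does not give the correlation), and applying the 1.5 g/L target to the combined 267.5 ml volume is likewise an assumption.

```python
def seed_volume_ml(seed_od600, g_per_l_per_od, target_cdw_g_per_l, final_volume_ml):
    """Volume of seed culture (ml) whose cells give the target cell dry weight.

    seed_od600 and g_per_l_per_od together estimate the seed culture's cell
    dry weight in g/L; both example values used below are hypothetical.
    """
    seed_cdw_g_per_l = seed_od600 * g_per_l_per_od
    biomass_needed_g = target_cdw_g_per_l * final_volume_ml / 1000.0
    return 1000.0 * biomass_needed_g / seed_cdw_g_per_l


# Stated volumes: about 250 ml of medium plus about 17.5 ml of resuspended cells.
MEDIUM_ML = 250.0
INOCULUM_ML = 17.5
inoculum_pct = 100.0 * INOCULUM_ML / MEDIUM_ML  # corresponds to about 7% v/v

# Hypothetical seed reading: OD600 of 60 and 0.5 g/L dry weight per OD unit.
volume = seed_volume_ml(
    seed_od600=60.0,
    g_per_l_per_od=0.5,
    target_cdw_g_per_l=1.5,
    final_volume_ml=MEDIUM_ML + INOCULUM_ML,
)
print(f"inoculum: {inoculum_pct:.1f}% v/v, seed culture to harvest: {volume:.2f} ml")
```

The harvested cells are pelleted and resuspended in a fixed 17.5 ml, so the inoculum volume stays constant while the amount of seed culture harvested varies with its density.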

Serum Bottle Fermentation:

Fermentation performances of BD34373 (SEQ ID NO:54) and BD34378 (SEQ ID NO:58) were evaluated using the serum bottle fermentation system. New Wheaton Thin-Flng Lyp Stopper (VWR Inc., PA, USA) wrap individual 125 mL Anaerobic Media bottles (VWR Inc., PA, USA) allowed close control over agitation, pH, and temperature, ensuring consistency of the environment during fermentation. Serum bottle fermentations were used to test performance of the yeast strains expressing XI genes on clean sugar media (see Table 21 (below), supplemented with nutrients (Table 19, above) and vitamins (Table 20, above)).

TABLE 21
Clean sugar

Component     Conc.
Glucose       80 g/L
Xylose        80 g/L
Arabinose     5.0 g/L
Acetic acid   8.0 g/L

Yeast cells were inoculated into about 200 ml of YP (Yeast extract, Peptone) medium containing about 0.5% glucose and about 3% xylose in a 500 ml Bellco baffled flask (Bellco, Inc.). Strains were cultivated at about 30° C. with about 200 rpm agitation for at least 18 hours until full saturation. Optical density was assessed by measuring light absorbance at a wavelength of 600 nm. The targeted inoculation level was about 1.5 g/L cell dry weight. The appropriate volume of SEED culture was harvested by centrifugation (about 3500 rpm for 10 min) to pellet the cells, which were resuspended in about 20 ml of media. The resuspended cell solution was used to inoculate a 150 ml serum bottle containing about 90 ml of fermentation media. Autoclaved stoppers were placed into the serum bottles, which were then clamped with aluminium seals (Bellco Inc., USA) using a seal crimper (Bellco Inc., USA). The aluminium seal cap was peeled off and a needle (Fisher, USA) was inserted. Fermentation was performed at about 35° C., pH 5.5, with about 200 rpm agitation. The duration of fermentation was about 44 hours. Fermentation performance for yeast strain BD34373 (SEQ ID NO:54) is presented in FIG. 7C and that for BD34378 (SEQ ID NO:58) is presented in FIG. 7D.
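For interpreting fermentation results on the clean sugar medium of Table 21, it can help to know the stoichiometric ceiling on ethanol. The sketch below computes the theoretical maximum from the textbook fermentation stoichiometries (one glucose to two ethanol; three xylose to five ethanol). It is an illustration only: it ignores biomass formation and by-products, treats arabinose as non-fermented, and the measured titer passed to `fraction_of_theoretical` is a hypothetical number, not a result from this disclosure.

```python
# Molar masses in g/mol.
M_GLUCOSE = 180.16
M_XYLOSE = 150.13
M_ETHANOL = 46.07

# Stoichiometric mass yields (g ethanol per g sugar).
YIELD_GLUCOSE = 2 * M_ETHANOL / M_GLUCOSE      # C6H12O6 -> 2 C2H5OH + 2 CO2
YIELD_XYLOSE = 5 * M_ETHANOL / (3 * M_XYLOSE)  # 3 C5H10O5 -> 5 C2H5OH + 5 CO2


def max_ethanol_g_per_l(glucose_g_per_l, xylose_g_per_l):
    """Theoretical ethanol titer if both sugars were fully converted."""
    return glucose_g_per_l * YIELD_GLUCOSE + xylose_g_per_l * YIELD_XYLOSE


def fraction_of_theoretical(ethanol_g_per_l, glucose_g_per_l, xylose_g_per_l):
    """Measured ethanol as a fraction of the stoichiometric maximum."""
    return ethanol_g_per_l / max_ethanol_g_per_l(glucose_g_per_l, xylose_g_per_l)


# Sugar concentrations from Table 21.
ceiling = max_ethanol_g_per_l(80.0, 80.0)
print(f"theoretical maximum: {ceiling:.1f} g/L ethanol")

# Hypothetical measured titer, for illustration only.
print(f"fraction at 60 g/L: {fraction_of_theoretical(60.0, 80.0, 80.0):.2f}")
```

Both sugars share a mass yield of roughly 0.51 g ethanol per g sugar, so the ceiling depends on total fermentable sugar rather than on the glucose to xylose split.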

5.14 Example 13: Comparative Activity of XI's Encoded by Codon Optimized vs. Non-Optimized Open Reading Frames

The coding sequences for the XIs of SEQ ID NOs:38, 78 and 96 were subjected to codon optimization using two approaches: the Boles codon optimization method and the DNA 2.0 Gene Designer software. The codon optimized sequences are set forth in Table 22 below:

TABLE 22 SEQ Sequence ID Description NO: Sequence Boles codon optimized244 ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA DNA encoding XI ofGGTAAGGACTCTAAGAACGTTATGGCTTTCCACTATTATGAACCA SEQ ID NO: 78GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTCGCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAATTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGGTGACGCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATTATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGACTTGGTTGAAGAAGGTGACACCATTGAAGAATATGAAGCTAGAATGAAGGCTATTACCGACTATGCTCAAGAAAAGATGAAGCAATTCCCAAACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTAACAAGAGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATTAAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGTTATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTCTTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAACAGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATTGACAACTATGAATTGACCCAAGCTATGTTGGAAATTATTAGAAACGGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGAAGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCTGGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGACATTTTGGAAAACTCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCTTCTTTCGACCAAGGTGTTGGTAAGGACTTCGAAGACGGTAAGTTGACCTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGGTGAACCAAAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT TTGTATGCTAAGTAABoles codon optimized 245 ATGGCTAAGGAATATTTCCCATTCATTGGTAAGGTTCCATTCGAADNA encoding XI of GGTACTGAATCTAAGAACGTTATGGCTTTCCACTATTATGAACCASEQ ID NO: 96 
GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTCGCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAATTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGCTGACGCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATTATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGACTTGGTTGAAGAAGGTGAAACCGTTGCTGAATATGAAGCTAGAATGAAGGTTATTACCGACTATGCTTTGGAAAAGATGCAACAATTCCCAAACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTCACAAGAGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATTAAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGTTATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAGGGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTCTTGAGGGCTCACGGTTTGGACAAGGACTTCAAGGTTAACATTGAAGTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAACAGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATTGACAACTATGAATTGACCCAAGCTATGATGGAAATTATTAGAAACGGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGAAGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCTGGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGCTATTTTGGAAGAATCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCTTCTTTCGACGAAGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGTCTTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGAAGAACCAAAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT TTGTATGCTAAGTAABoles codon optimized 246 ATGAAGGAAATTTTCCCAAACATTCCAGAAATTAAGTTCGAAGGTDNA encoding XI of AAGGACTCTAAGAACCCATTCGCTTTCCACTATTATAACCCAGACSEQ ID NO: 38 
CAAATTATTTTGGGTAAGCCAATGAAGGAACACTTGCCATTCGCTATGGCTTGGTGGCACAACTTGGGTGCTACCGGTGTTGACATGTTCGGTGCTGGTCCAGCTGACAAGTCTTTCGGTGCTAAGGTTGGTACTATGGAACACGCTAAGGCTAAGGTTGACGCTGGTTTCGAATTCATGAAGAAGTTGGGTATTAGATATTTCTGTTTCCACGACGTTGACTTGGTTCCAGAATGTGCTGACATTAAGGACACCAACAAGGAATTGGACGAAATTTCTGACTATATTTTGGAAAAGATGAAGGGTACTGACATTAAGTGTTTGTGGGGTACTGCTAACATGTTCTCTAACCCAAGATTCTGTAACGGTGCTGGTTCTACCAACTCTGCTGACGTTTTCGCTTTCGCTGCTGCTCAAGTTAAGAAGGCTTTGGACATTACCGTTAAGTTGGGTGGTAGAGGTTATGTTTTCTGGGGTGGTAGAGAAGGTTATGAAACCTTGTTGAACACCGACGTTAAGTTCGAACAAGAAAACATTGCTAGATTGATGAAGATGGCTGTTGAATATGGTAGATCTATTGGTTTCAAGGGTGACTTCTATATTGAACCAAAGCCAAAGGAACCAATGAAGCACCAATATGACTTCGACGCTGCTACCGCTATTGGTTTCTTGAGGGCTCACGGTTTGGACAAGGACTTCAAGTTGAACATTGAAGCTAACCACGCTACCTTGGCTGGTCACACCTTCCAACACGACTTGAGAATTTCTGCTATTAACGGTATGTTGGGTTCTATTGACGCTAACCAAGGTGACATGTTGTTGGGTTGGGACACCGACGAATTCCCATTCGACGTTTATTCTGCTACCCAATGTATGTATGAAGTTTTGAAGAACGGTGGTTTGACCGGTGGTTTCAACTTCGACTCTAAGACCAGAAGACCATCTTATACCATGGAAGACATGTTCTTGGCTTATATTTTGGGTATGGACACCTTCGCTTTGGGTTTGATTAAGGCTGCTCAAATTATTGAAGACGGTAGAATTGACCAATTCATTGAAAAGAAGTATTCTTCTTTCAGAGAAACCGAAATTGGTCAAAAGATTTTGAACAACAAGACCTCTTTGAAGGAATTGTCTGACTATGCTTGTAAGATGGGTGCTCCAGAATTGCCAGGTTCTGGTAGACAAGAAATGTTGGAAGCTATTGTTAACGAC GTTTTGTTCGGTAAGTAADNA 2.0 codon 247 ATGGCTAAGGAATACTTTCCATTCACCGGAAAGATACCATTTGAAoptimized DNA GGTAAAGATTCTAAAAACGTAATGGCTTTTCATTATTACGAACCAencoding XI of SEQ GAAAAAGTTGTTATGGGCAAAAAGATGAAAGATTGGTTGAAATTTID NO: 78 
GCGATGGCTTGGTGGCATACACTCGGGGGAGCTTCCGCTGATCAATTTGGCGGACAAACCAGATCATACGAATGGGATAAAGCAGGCGATGCCGTGCAGAGAGCAAAGGATAAAATGGATGCTGGTTTCGAAATTATGGATAAGCTAGGTATCGAATACTTCTGCTTCCATGACGTCGATTTGGTTGAAGAGGGCGATACTATCGAGGAATACGAGGCGAGAATGAAGGCTATAACAGACTACGCCCAGGAGAAAATGAAACAATTTCCTAACATCAAATTACTCTGGGGTACTGCCAATGTGTTTGGTAACAAAAGATACGCAAACGGGGCTTCAACTAATCCTGACTTCGATGTTGTTGCAAGAGCCATTGTTCAAATCAAAAACGCGATAGACGCTACTATTAAACTAGGTGGCACGAATTACGTCTTTTGGGGTGGAAGGGAAGGTTACATGTCTCTGCTTAATACAGATCAGAAGAGAGAGAAGGAACACATGGCAACAATGCTCACTATGGCCCGTGACTACGCAAGAGCAAAAGGTTTTAAGGGCACTTTCCTTATCGAACCAAAGCCTATGGAACCATCAAAACACCAATATGATGTTGACACAGAAACTGTGATCGGCTTTTTGAAAGCTCATAACTTGGACAAGGATTTCAAAGTAAACATTGAAGTTAATCATGCTACACTAGCAGGACACACATTTGAACACGAACTGGCCTGTGCGGTAGATGCAGGGATGCTGGGTTCTATCGACGCTAATAGAGGGGATGCTCAAAATGGTTGGGATACCGATCAATTTCCAATCGACAATTACGAATTAACACAAGCTATGTTGGAGATTATTAGAAATGGAGGTTTGGGTAATGGGGGTACAAACTTCGATGCTAAGATTCGTCGAAATTCCACAGACTTAGAAGATTTGTTCATTGCGCATATATCTGGTATGGATGCTATGGCCAGAGCATTAATGAATGCCGCTGACATCTTAGAAAACAGTGAACTTCCAGCAATGAAAAAGGCCAGATATGCCTCTTTCGATCAAGGTGTAGGAAAAGATTTTGAGGACGGCAAGTTGACTTTAGAACAAGTCTATGAATACGGTAAAAAGGTCGGCGAACCTAAGCAAACCAGCGGAAAGCAAGAGAAATACGAGACTATCGTGGCT CTTTATGCAAAATAADNA 2.0 codon 248 ATGGCCAAGGAGTACTTCCCTTTTATCGGCAAGGTCCCATTTGAAoptimized DNA GGGACAGAATCCAAAAACGTCATGGCTTTTCACTACTATGAACCTencoding XI of SEQ GAGAAGGTAGTTATGGGTAAAAAGATGAAAGATTGGTTGAAGTTTID NO: 96 
GCAATGGCATGGTGGCATACCTTGGGTGGGGCCTCTGCTGATCAATTTGGAGGACAAACTAGATCATACGAATGGGATAAAGCAGCTGATGCCGTTCAAAGAGCCAAAGATAAAATGGATGCCGGGTTCGAAATCATGGACAAATTGGGTATCGAATATTTCTGCTTCCATGATGTAGACCTTGTTGAGGAGGGTGAAACCGTCGCTGAATATGAGGCGAGAATGAAGGTTATTACGGATTACGCACTAGAAAAGATGCAGCAGTTTCCAAACATAAAACTATTGTGGGGTACTGCTAATGTTTTCGGACATAAACGTTACGCTAACGGAGCTTCCACTAATCCAGACTTTGATGTTGTCGCGAGAGCTATCGTTCAAATCAAAAATGCAATCGATGCTACAATTAAGTTAGGAGGGACAAATTACGTGTTCTGGGGTGGTAGAGAAGGTTACATGAGCCTGCTTAATACAGATCAAAAGAGAGAAAAGGAGCACATGGCAACAATGCTAACAATGGCTAGAGATTATGCCCGAGCTAAGGGCTTCAAAGGCACTTTTCTGATAGAACCTAAACCAATGGAACCATCTAAACACCAATACGATGTAGACACCGAAACTGTAATAGGCTTCCTTCGTGCACATGGTTTGGATAAAGATTTTAAGGTGAACATTGAAGTGAATCATGCTACTTTAGCCGGTCACACTTTTGAACATGAATTAGCATGTGCTGTTGATGCGGGAATGTTGGGTTCTATCGATGCCAACAGAGGCGACGCCCAAAATGGTTGGGACACAGACCAGTTTCCTATTGACAATTACGAACTCACCCAAGCTATGATGGAAATTATCAGGAATGGGGGACTGGGAAATGGTGGTACGAACTTTGATGCGAAGATAAGGAGAAACTCTACTGACTTAGAAGATTTGTTTATAGCACATATTTCAGGTATGGACGCTATGGCAAGAGCTTTAATGAATGCCGCAGCAATCTTGGAGGAAAGTGAACTCCCAGCTATGAAAAAGGCAAGATACGCAAGTTTTGATGAGGGTATTGGCAAAGACTTCGAAGATGGTAAACTATCTTTAGAACAAGTGTACGAGTATGGCAAAAAGGTAGAGGAACCAAAACAAACATCAGGCAAACAAGAGAAATATGAAACAATTGTCGCT CTTTACGCGAAGTAADNA 2.0 codon 249 ATGAAGGAAATCTTCCCTAACATCCCAGAGATCAAATTCGAAGGCoptimized DNA AAAGACTCTAAAAATCCATTTGCCTTCCACTATTACAACCCAGACencoding XI of SEQ CAGATCATTTTAGGTAAACCAATGAAGGAGCACTTGCCATTTGCTID NO: 38 
ATGGCTTGGTGGCATAATCTAGGCGCCACTGGTGTTGATATGTTTGGTGCAGGCCCTGCGGACAAATCTTTCGGAGCTAAAGTAGGAACTATGGAACATGCAAAAGCGAAAGTTGATGCTGGGTTTGAGTTCATGAAGAAATTAGGAATCAGATATTTCTGCTTTCATGATGTTGACTTGGTTCCTGAGTGTGCTGACATTAAGGATACAAACAAGGAACTTGATGAAATCTCTGACTACATTTTGGAAAAGATGAAAGGTACTGACATAAAGTGTTTGTGGGGCACGGCTAATATGTTTTCCAATCCAAGATTTTGTAACGGCGCTGGCTCAACTAATTCAGCAGATGTCTTTGCATTCGCTGCTGCACAAGTCAAGAAAGCACTTGACATTACAGTCAAACTGGGTGGGAGAGGATACGTTTTCTGGGGTGGTAGAGAAGGCTACGAAACATTGTTGAATACAGACGTTAAGTTTGAACAAGAGAATATTGCAAGGTTAATGAAAATGGCAGTGGAATATGGGCGTTCTATAGGTTTTAAAGGTGATTTCTACATTGAGCCAAAACCAAAGGAACCTATGAAACATCAATACGATTTCGATGCCGCAACAGCAATAGGTTTCCTTAGAGCCCACGGGTTGGATAAAGACTTTAAGCTCAATATCGAAGCCAACCACGCAACACTTGCAGGCCATACATTTCAACATGATCTTAGAATATCTGCTATTAACGGAATGCTCGGCTCAATTGATGCCAATCAGGGTGATATGCTACTAGGTTGGGATACTGATGAGTTTCCATTTGATGTATACTCCGCTACACAATGCATGTATGAGGTGTTGAAAAATGGTGGTCTGACCGGTGGCTTCAACTTCGATAGTAAGACCAGACGTCCTTCATACACTATGGAAGATATGTTTCTGGCGTATATCTTAGGTATGGACACATTTGCTTTAGGTCTAATCAAAGCCGCTCAAATCATTGAAGATGGCAGAATTGACCAGTTTATAGAAAAGAAATACTCCAGTTTTCGAGAAACCGAAATCGGACAAAAGATTCTCAATAACAAAACTTCATTGAAGGAATTATCTGATTACGCCTGTAAGATGGGTGCGCCAGAATTACCTGGAAGCGGTAGACAAGAGATGCTTGAAGCTATCGTGAATGAT GTATTGTTTGGAAAATAA
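As an illustration of what the two optimization approaches change, the following minimal Python sketch translates the first 45 nucleotides of SEQ ID NO:244 (Boles) and SEQ ID NO:247 (DNA 2.0), both copied from Table 22, and lists the codons that differ. The helper names (`codons`, `translate`) are hypothetical and chosen for this sketch only; the standard genetic code is assumed, and the sketch is not part of the optimization methods themselves.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codons(dna):
    """Split a DNA string into complete codons, ignoring any trailing bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Translate a DNA string with the standard genetic code ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


# First 45 nucleotides of two Table 22 entries that encode the same XI.
BOLES_244 = "ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA"
DNA20_247 = "ATGGCTAAGGAATACTTTCCATTCACCGGAAAGATACCATTTGAA"

peptide_boles = translate(BOLES_244)
peptide_dna20 = translate(DNA20_247)

# Codon pairs that differ between the two optimized fragments.
synonymous_changes = [
    (a, b) for a, b in zip(codons(BOLES_244), codons(DNA20_247)) if a != b
]

print(peptide_boles)
print(peptide_dna20)
print(synonymous_changes)
```

Both fragments translate to the same 15-residue peptide (MAKEYFPFTGKIPFE) and differ at five codons, so within this fragment the two designs differ only in synonymous codon choice.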

The codon optimized DNA sequences were synthesized and incorporated into expression cassettes substantially as described in Example 12. Yeast strains were generated that included single copies of individual XI open reading frames integrated into the yeast YER131.5 locus. Strains confirmed to contain the XI expression cassettes were inoculated into about 3 ml of modified YP Media (YP+0.1% Glucose+3.0% Xylose) and incubated overnight at about 30° C. and about 220 rpm. These overnight cultures were subcultured into about 25 ml of the same media to about OD₆₀₀=0.2. Samples were incubated overnight at about 30° C. and about 220 rpm. Cultures were harvested when OD₆₀₀ was between about 3 and 4. Pellets were collected by centrifugation for about 5 minutes at about 4000 rpm. The supernatants were discarded and the pellets were washed with about 25 ml of distilled-deionized water and centrifuged again using the same conditions. Supernatants were discarded and the pellets were frozen at about −20° C. until lysis and characterization.
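The subculturing step to about OD₆₀₀=0.2 in about 25 ml can be worked out with a simple dilution calculation. The Python sketch below is illustrative only: the function name `inoculum_volume_ml` and the overnight OD₆₀₀ value of 5.0 are assumptions that do not come from the example, and the calculation assumes OD₆₀₀ scales linearly with cell density, which generally requires measuring a suitably diluted sample.

```python
def inoculum_volume_ml(overnight_od, target_od=0.2, final_volume_ml=25.0):
    """Volume of overnight culture needed to reach target_od in final_volume_ml.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.
    """
    if overnight_od <= target_od:
        raise ValueError("overnight culture is not denser than the target")
    return target_od * final_volume_ml / overnight_od


# Hypothetical overnight reading; substitute the measured OD600.
volume = inoculum_volume_ml(overnight_od=5.0)
print(f"Add about {volume:.2f} ml of overnight culture and bring to 25 ml")
```

With the assumed overnight reading of 5.0, about 1 ml of overnight culture would be brought to a final volume of 25 ml.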

Cell pellets were thawed and about 200 mg of each pellet sample was weighed out into 2 ml microcentrifuge tubes. About 50 μl of Complete®, EDTA-free Protease Inhibitor cocktail (Roche Part#11873 580 001) at 5 times the concentration stated in the manufacturer's protocol was added to each sample. About 0.5 ml of Y-PER Plus® Dialyzable Yeast Protein Extraction Reagent (Thermo Scientific Part#78999) (YP+) was then added to each sample. Samples were incubated at about 25° C. for about 4 hours on a rotating mixer. Sample supernatants were collected after centrifugation at about 10,000×g for about 10 minutes for characterization.

Total protein concentrations of the XI sample extracts prepared above were determined using Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad, cat#500-0006, Hercules Calif.), which is a modified version of the Bradford method (Bradford). In this assay, optical density readings were taken on a spectrophotometer set to 595 nm, and the standard curve was plotted as a linear regression line.
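The linear standard curve described above can be sketched as an ordinary least-squares fit of absorbance at 595 nm against known protein concentration, followed by inversion of the fitted line for an unknown sample. The following Python sketch is illustrative only: the function names and all standard and sample values are hypothetical and are not data from this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def protein_conc_mg_per_ml(a595, slope, intercept):
    """Invert the standard curve to estimate protein concentration."""
    return (a595 - intercept) / slope


# Hypothetical protein standards (mg/ml) and their A595 readings.
standards_mg_per_ml = [0.0, 0.125, 0.25, 0.5, 0.75, 1.0]
standards_a595 = [0.05, 0.1125, 0.175, 0.30, 0.425, 0.55]

slope, intercept = fit_line(standards_mg_per_ml, standards_a595)

# Hypothetical reading for a diluted extract; multiply the result by the
# dilution factor to obtain the concentration of the undiluted extract.
sample_conc = protein_conc_mg_per_ml(0.30, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc:.3f} mg/ml")
```

Readings outside the range of the standards should not be extrapolated from the fitted line; such samples would normally be re-diluted and re-measured.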

XI activity was determined using assay conditions at pH 7.5 as described in Section 5.1.2. The specific activities of the codon optimized XIs are shown in Table 23.

TABLE 23

Protein       Nucleic Acid   Codon optimization           SA, pH 7.5
SEQ ID NO.    SEQ ID NO:     method                       (U/mg)
78            77             Native                       0.43
78            247            DNA2.0 Codon Optimization    0.14
78            244            Boles Codon Optimization     0.53
96            95             Native                       0.42
96            248            DNA2.0 Codon Optimization    0.14
96            245            Boles Codon Optimization     0.40
38            37             Native                       0.25
38            249            DNA2.0 Codon Optimization    0.27
38            246            Boles Codon Optimization     0.40

Because these specific activity data are determined on the basis of the total cellular protein mass, any variations in specific activity for any given XI are due to expression levels. These data demonstrate that the Boles codon optimization approach improves the expressibility of bacterial XIs in S. cerevisiae.
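The comparison in Table 23 can be made explicit by expressing each codon optimized specific activity as a ratio to the native open reading frame encoding the same XI. The Python sketch below uses the Table 23 values; the dictionary layout and the function name `fold_vs_native` are hypothetical conveniences for this illustration.

```python
# Specific activities (U/mg, pH 7.5) from Table 23, keyed by protein SEQ ID NO.
TABLE_23 = {
    78: {"native": 0.43, "dna2.0": 0.14, "boles": 0.53},
    96: {"native": 0.42, "dna2.0": 0.14, "boles": 0.40},
    38: {"native": 0.25, "dna2.0": 0.27, "boles": 0.40},
}


def fold_vs_native(row):
    """Ratio of each optimized ORF's specific activity to the native ORF's."""
    return {
        method: sa / row["native"]
        for method, sa in row.items()
        if method != "native"
    }


fold = {seq_id: fold_vs_native(row) for seq_id, row in TABLE_23.items()}

for seq_id, ratios in fold.items():
    summary = ", ".join(f"{m}: {r:.2f}x" for m, r in ratios.items())
    print(f"SEQ ID NO:{seq_id} -> {summary}")
```

On these tabulated values, the Boles-optimized sequence gives a higher specific activity than the DNA 2.0-optimized sequence for all three XIs. Relative to the native sequence, the Boles ratio is above 1 for SEQ ID NOs:78 and 38 and slightly below 1 (about 0.95) for SEQ ID NO:96.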

5.15 Example 14: Comparative Activity of XIs of the Disclosure vs. Orpinomyces sp. XI

The specific activities of exemplary XIs of the disclosure were compared to the specific activity of a known XI, Orpinomyces sp. XI assigned Genbank Accession No. 169733248. The XIs were incorporated into expression cassettes substantially as described in Example 12. Yeast strains were generated that included single copies of individual XI open reading frames integrated into the yeast YER131.5 locus. Activity of the individual clones was measured at pH 7.5 using a similar approach to that used in Example 12, except that the total protein concentrations were based on optical density readings at 450 nm and 595 nm, with the standard curve plotted as a parametric fit. Results are shown in Table 24, below.

TABLE 24

Protein       Nucleic Acid   Codon optimization            SA, pH 7.5   % activity as compared
SEQ ID NO.    SEQ ID NO:     method                        (U/mg)       to Orpinomyces XI
NA            NA             Host negative control         0.005          7%
                             (no recombinant XI)
Genbank Accession No.        Orpinomyces sp. XI (Native)   0.071        100%
169733248
96            95             Native                        0.137        193%
54            238            Boles codon optimization      0.193        272%
58            239            Boles codon optimization      0.205        288%
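The percentage column of Table 24 corresponds to each specific activity divided by that of the Orpinomyces sp. XI. The short Python sketch below recomputes it from the tabulated specific activities; the labels and the function name `percent_of_reference` are hypothetical. Recomputing from the rounded table values gives 289% for SEQ ID NO:58 rather than the tabulated 288%, a difference that may reflect rounding of the underlying measurements.

```python
# Specific activities (U/mg, pH 7.5) from Table 24.
ORPINOMYCES_SA = 0.071
SAMPLES = {
    "host negative control": 0.005,
    "SEQ ID NO:96 (native)": 0.137,
    "SEQ ID NO:54 (Boles)": 0.193,
    "SEQ ID NO:58 (Boles)": 0.205,
}


def percent_of_reference(sa, reference=ORPINOMYCES_SA):
    """Specific activity expressed as a percentage of the reference XI."""
    return 100.0 * sa / reference


percentages = {
    name: round(percent_of_reference(sa)) for name, sa in SAMPLES.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of Orpinomyces sp. XI")
```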

While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention(s).

What is claimed is:
 1. An isolated nucleic acid sequence comprising a nucleotide sequence having at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of SEQ ID NO: 245 or to a portion thereof encoding a xylose isomerase catalytic or dimerization domain, wherein the nucleic acid encodes a polypeptide with at least one substitution relative to the polypeptide having the sequence of SEQ ID NO: 96 or at least one heterologous amino acid flanking the N-terminal or the C-terminal.
 2. The nucleic acid sequence of claim 1, which is codon optimized for expression in a eukaryotic cell.
 3. The nucleic acid sequence of claim 1, which is codon optimized for expression in yeast.
 4. A vector comprising the nucleic acid sequence of claim 1.
 5. The vector of claim 4, which further comprises an origin of replication.
 6. The vector of claim 4, which further comprises a promoter sequence operably linked to said nucleic acid sequence.
 7. The vector of claim 6, wherein the promoter sequence is operable in yeast.
 8. The vector of claim 6, wherein the promoter sequence is operable in filamentous fungi.
 9. A recombinant cell engineered to express a polypeptide encoded by the nucleic acid sequence of claim 1.
 10. The recombinant cell of claim 9, wherein the cell is a eukaryotic cell.
 11. The recombinant cell of claim 9, wherein the cell is a yeast cell.
 12. The recombinant cell of claim 11, wherein the yeast cell is a yeast cell of the genus Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Issatchenkia or Yarrowia.
 13. The recombinant cell of claim 11, wherein the yeast cell is of the species S. cerevisiae, S. bulderi, S. barnettii, S. exiguus, S. uvarum, S. diastaticus, K. lactis, K. marxianus or K. fragilis, or Issatchenkia orientalis.
 14. The recombinant cell of claim 13, wherein the cell is a S. cerevisiae cell.
 15. The recombinant cell of claim 14, comprising one or more genetic modifications resulting in at least one, any two, any three, any four or all of the following phenotypes: (a) an increase in transport of xylose into the cell; (b) an increase in xylulose kinase activity; (c) an increase in aerobic growth rate on xylose; (d) an increase in flux through the pentose phosphate pathway into glycolysis; (e) a decrease in aldose reductase activity; (f) a decrease in sensitivity to catabolite repression; (g) an increase in tolerance to ethanol, intermediates, osmolality or organic acids; and (h) a reduced production of byproducts.
 16. The recombinant cell of claim 15, wherein one or more genetic modifications result in increased expression levels of one or more of a hexose or pentose transporter, a xylulose kinase, an enzyme from the pentose phosphate pathway, a glycolytic enzyme and an ethanologenic enzyme.
 17. A host cell transformed with the vector of claim 4.
 18. The host cell of claim 17 which is a prokaryotic cell.
 19. The host cell of claim 18 which is a bacterial cell.
 20. The host cell of claim 17 which is a eukaryotic cell.